[
  {
    "path": ".gitignore",
    "content": "*/*.data\n*/data\n*/*.pt\n*/*.swp\n*/*.txt\n*/*.png\n*/*.dat\n*/tmp\n*.swp\n*.txt\n*.png\n*.gif\n*.dat\nloss*\ngrads_*\n__pycache__/*\n*/__pycache__/*\n*/__pycache__/\ntorchmod*\nparams*\ntmp*/\n"
  },
  {
    "path": "LICENSE",
    "content": "\"License\" shall mean the terms and conditions for use, reproduction, and distribution as defined by the text below.\n \n\"You\" (or \"Your\") shall mean an individual or Legal Entity exercising permissions granted by this License.\n \n\"Legal Entity\" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, \"control\" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.\n \n\"Source\" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.\n \n\"Object\" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.\n \n\"Work\" shall mean the work of authorship, whether in Source or Object form, made available under this License.\n \nThis License governs use of the accompanying Work, and your use of the Work constitutes acceptance of this License.\n \nYou may use this Work for any non-commercial purpose, subject to the restrictions in this License. Some purposes which can be non-commercial are teaching, academic research, and personal experimentation. You may also distribute this Work with books or other teaching materials, or publish the Work on websites, that are intended to teach the use of the Work.\n \nYou may not use or distribute this Work, or any derivative works, outputs, or results from the Work, in any form for commercial purposes. 
Non-exhaustive examples of commercial purposes would be running business operations, licensing, leasing, or selling the Work, or distributing the Work for use with commercial products.\n \nYou may modify this Work and distribute the modified Work for non-commercial purposes, however, you may not grant rights to the Work or derivative works that are broader than or in conflict with those provided by this License. For example, you may not distribute modifications of the Work under terms that would permit commercial use, or under terms that purport to require the Work or derivative works to be sublicensed to others.\n\nIn return, we require that you agree:\n\n1. Not to remove any copyright or other notices from the Work.\n \n2. That if you distribute the Work in Source or Object form, you will include a verbatim copy of this License.\n \n3. That if you distribute derivative works of the Work in Source form, you do so only under a license that includes all of the provisions of this License and is not in conflict with this License, and if you distribute derivative works of the Work solely in Object form you do so only under a license that complies with this License.\n \n4. That if you have modified the Work or created derivative works from the Work, and distribute such modifications or derivative works, you will cause the modified files to carry prominent notices so that recipients know that they are not receiving the original Work. Such notices must state: (i) that you have changed the Work; and (ii) the date of any changes.\n \n5. If you publicly use the Work or any output or result of the Work, you will provide a notice with such use that provides any person who uses, views, accesses, interacts with, or is otherwise exposed to the Work (i) with information of the nature of the Work, (ii) with a link to the Work, and (iii) a notice that the Work is available under this License.\n \n6. THAT THE WORK COMES \"AS IS\", WITH NO WARRANTIES. 
THIS MEANS NO EXPRESS, IMPLIED OR STATUTORY WARRANTY, INCLUDING WITHOUT LIMITATION, WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE OR ANY WARRANTY OF TITLE OR NON-INFRINGEMENT. ALSO, YOU MUST PASS THIS DISCLAIMER ON WHENEVER YOU DISTRIBUTE THE WORK OR DERIVATIVE WORKS.\n \n7. THAT NEITHER UBER TECHNOLOGIES, INC. NOR ANY OF ITS AFFILIATES, SUPPLIERS, SUCCESSORS, NOR ASSIGNS WILL BE LIABLE FOR ANY DAMAGES RELATED TO THE WORK OR THIS LICENSE, INCLUDING DIRECT, INDIRECT, SPECIAL, CONSEQUENTIAL OR INCIDENTAL DAMAGES, TO THE MAXIMUM EXTENT THE LAW PERMITS, NO MATTER WHAT LEGAL THEORY IT IS BASED ON. ALSO, YOU MUST PASS THIS LIMITATION OF LIABILITY ON WHENEVER YOU DISTRIBUTE THE WORK OR DERIVATIVE WORKS.\n \n8. That if you sue anyone over patents that you think may apply to the Work or anyone's use of the Work, your license to the Work ends automatically.\n \n9. That your rights under the License end automatically if you breach it in any way.\n \n10. Uber Technologies, Inc. reserves all rights not expressly granted to you in this License.\n\n\n\n"
  },
  {
    "path": "NOTICE.md",
    "content": "The `awd-lstm-lm` directory (language modelling with plastic LSTMs) was forked\nfrom the [Salesforce Language Model\nToolkit](https://github.com/salesforce/awd-lstm-lm/), which implements the\nbaseline language modelling system used in our experiments (this baseline is\nthe model described in [Merity et al. (2017), Regularizing and Optimizing LSTM\nLanguage Models](https://arxiv.org/abs/1708.02182).\n\nLicense for the Salesforce Language Model Toolkit: \n\nCopyright (c) 2017, \nAll rights reserved.\n\nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions are met:\n\n* Redistributions of source code must retain the above copyright notice, this\n  list of conditions and the following disclaimer.\n\n* Redistributions in binary form must reproduce the above copyright notice,\n  this list of conditions and the following disclaimer in the documentation\n  and/or other materials provided with the distribution.\n\n* Neither the name of the copyright holder nor the names of its\n  contributors may be used to endorse or promote products derived from\n  this software without specific prior written permission.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\"\nAND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\nIMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\nDISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE\nFOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL\nDAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR\nSERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\nCAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,\nOR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\nOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n"
  },
  {
    "path": "README.md",
    "content": "## Differentiable plasticity\n\nThis repo contains implementations of the algorithms described in [Differentiable plasticity: training plastic networks with gradient descent](https://arxiv.org/abs/1804.02464), a research paper from Uber AI Labs.\n\nNOTE: please see also our more recent work on differentiable *neuromodulated* plasticity: the \"[backpropamine](https://github.com/uber-research/backpropamine)\" framework.\n\nThere are four different experiments included here:\n\n- `simple`: Binary pattern memorization and completion. Read this one first!\n- `images`: Natural image memorization and completion\n- `omniglot`: One-shot learning in the Omniglot task\n- `maze`: Maze exploration task (reinforcement learning)\n\n\nWe strongly recommend studying the `simple/simplest.py` program first, as it is deliberately kept as simple as possible while showing full-fledged differentiable plasticity learning.\n\nThe code requires Python 3 and PyTorch 0.3.0 or later. The `images` code also requires scikit-learn. By default our code requires a GPU, but most programs can be run on CPU by simply uncommenting the relevant lines (for others, remove all occurrences of `.cuda()`).\n\nTo comment, please open an issue. We will not be accepting pull requests but encourage further study of this research. To learn more, check out our accompanying article on the [Uber Engineering Blog](https://eng.uber.com/differentiable-plasticity).\n\n## Copyright and licensing information\n\nCopyright (c) 2018-2019 Uber Technologies, Inc.\n\nAll code is licensed under the Uber Non-Commercial License (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at the root directory of this project. \n\nSee the LICENSE file in this repository for the specific language governing \npermissions and limitations under the License. \n\n"
  },
  {
    "path": "awd-lstm-lm/.gitignore",
    "content": "maintmp.py\nHDFS/\n*.patch\nmodel_*\nresults_*\n*.pt\n*.swp\n__pycache__/\ndata/\ncorpus*\n"
  },
  {
    "path": "awd-lstm-lm/LICENSE",
    "content": "BSD 3-Clause License\n\nCopyright (c) 2017, \nAll rights reserved.\n\nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions are met:\n\n* Redistributions of source code must retain the above copyright notice, this\n  list of conditions and the following disclaimer.\n\n* Redistributions in binary form must reproduce the above copyright notice,\n  this list of conditions and the following disclaimer in the documentation\n  and/or other materials provided with the distribution.\n\n* Neither the name of the copyright holder nor the names of its\n  contributors may be used to endorse or promote products derived from\n  this software without specific prior written permission.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\"\nAND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\nIMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\nDISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE\nFOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL\nDAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR\nSERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\nCAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,\nOR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\nOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n"
  },
  {
    "path": "awd-lstm-lm/OpusHdfsCopy.py",
    "content": "import os\nimport os.path\n\ndef checkHdfs():\n    return os.path.isfile('/opt/hadoop/latest/bin/hdfs')\n\ndef transferFileToHdfsPath(sourcepath, targetpath):\n    hdfspath = targetpath\n    targetdir = os.path.dirname(targetpath)\n    os.system('/opt/hadoop/latest/bin/hdfs dfs -mkdir -p {}'.format(targetdir))\n    result = os.system(\n        '/opt/hadoop/latest/bin/hdfs dfs -copyFromLocal -f {} {}'.format(sourcepath, hdfspath)\n    )\n    if result != 0:\n        raise OSError('Cannot copyFromLocal {} {} returned {}'.format(sourcepath, hdfspath, result))\n\ndef transferFileToHdfsDir(sourcepath, targetdir):\n    hdfspath = os.path.join(targetdir, os.path.basename(sourcepath))\n    os.system('/opt/hadoop/latest/bin/hdfs dfs -mkdir -p {}'.format(targetdir))\n    result = os.system(\n        '/opt/hadoop/latest/bin/hdfs dfs -copyFromLocal -f {} {}'.format(sourcepath, hdfspath)\n    )\n    if result != 0:\n        raise OSError('Cannot copyFromLocal {} {} returned {}'.format(sourcepath, hdfspath, result))\n\n"
  },
  {
    "path": "awd-lstm-lm/OpusPrepare.sh",
    "content": "cd /home/work\n\n#  $HOME is not the same as ~ !!!!\n\n# Installing pyenv and putting it in the path\ncurl -L https://raw.githubusercontent.com/yyuu/pyenv-installer/master/bin/pyenv-installer | bash\necho \"HOME is $HOME\"\necho 'export PATH=\"$HOME/.pyenv/bin:$PATH\"\neval \"$(pyenv init -)\"\neval \"$(pyenv virtualenv-init -)\"\n' > $HOME/.bashrc\n\n# Installing python 3.5 and making it default\nsource $HOME/.bashrc\npyenv install 3.5.2\npyenv local 3.5.2\n# Note: when we exit the script, environments go away and we need to re-source ~/.bashrc and re-run pyenv local 3.5.2\n\n# Installing numpy and PyTorch\npip install numpy==1.14\npip install torch\n\napt-get install unzip  # Some machines seem not to have it?\n\n# Downloading the data\nsh ./getdata.sh\n\n"
  },
  {
    "path": "awd-lstm-lm/README.md",
    "content": "# LSTMs with neuromodulated plasticity\n\n\nThis code implements language modelling on the Penn Treebank dataset, using LSTMs with neuromodulated plasticity (\"backpropamine\"), as described in [Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity (Miconi et al., ICLR 2016)](https://openreview.net/forum?id=r1lrAiA5Ym), a paper from Uber AI labs.\n\nThe code is forked from [Salesforce Language model toolkit](https://github.com/Smerity/awd-lstm-lm) and uses most of their parameters and design choices. The main differences are that we do not implement DropConnect and reduce batch size to 6 for computational reasons. This code requires Python 3 and PyTorch 1.0.\n\nTo comment, please open an issue. Note that the code is provided \"as is\": we cannot provide support or accept pull requests at this time.\n\n## Usage\n\nBefore running this code, run `getdata.sh` to obtain the Penn Treebank data.\n\nPlasticity and neuromodulation: `python3 main.py --batch_size 6 --data data/penn --dropouti 0.4 --dropouth 0.25  --epoch 500 --save PTB.pt --wdrop 0 --model PLASTICLSTM --modultype modplasth2mod --modulout fanout --nhid 1149  --alphatype perneuron --asgdtime 125 --agdiv 1149`\n\nPlasticity without neuromodulation: `python3 main.py --batch_size 6 --data data/penn --dropouti 0.4 --dropouth 0.25  --epoch 500 --save PTB.pt --wdrop 0 --model PLASTICLSTM --modultype none --modulout none --nhid 1149  --alphatype perneuron --asgdtime 125 --agdiv 1149`\n\nNo plasticity, just plain LSTM: `python3 main.py --batch_size 6 --data data/penn --dropouti 0.4 --dropouth 0.25  --epoch 500 --save PTB.pt --wdrop 0 --model MYLSTM --modultype modplasth2mod --modulout fanout --nhid 1150  --alphatype full --asgdtime 125 --agdiv 1150`\n\nNote that in all of the above, we use per-neuron plasticity coefficients and reduce the number of neurons in plastic LSTMs (`nhid`) to ensure that plastic LSTMs do not have more trainable parameters.\n\n## 
Code organization\n\nThe main program is `main.py`. There is some interface code in `model.py`. The code for actual plastic LSTMs is in `mylstm.py`.\n\n## Plastic LSTMs\n\nThe code for plastic LSTMs is relatively straightforward, as can be seen in `mylstm.py`.\n\nHowever, note that in `main.py` we selectively reduce the gradient for `alpha`\nparameters when using plastic LSTMs with either per-neuron or single `alpha`.\nMore precisely, we divide the gradient on `alpha` coefficients by a value that should be roughly equal\nto the number of neurons in the LSTM. This greatly enhances stability without\nforcing a reduction in learning rates.\n\n\n\n"
  },
  {
    "path": "awd-lstm-lm/TESTCOMMAND",
    "content": "python test.py  --model MYLSTM --nhid 1150 --file ./HDFS/ptb/model__SqUsq_MYLSTM_clip_cv2.0_modplasth2mod_fanout_i2c_perneuron_asgdtime125_agdiv1150_lr30_3l_1150h_0.5lstm_rngseed1.dat\n"
  },
  {
    "path": "awd-lstm-lm/data.py",
    "content": "import os\nimport torch\n\nfrom collections import Counter\n\n\nclass Dictionary(object):\n    def __init__(self):\n        self.word2idx = {}\n        self.idx2word = []\n        self.counter = Counter()\n        self.total = 0\n\n    def add_word(self, word):\n        if word not in self.word2idx:\n            self.idx2word.append(word)\n            self.word2idx[word] = len(self.idx2word) - 1\n        token_id = self.word2idx[word]\n        self.counter[token_id] += 1\n        self.total += 1\n        return self.word2idx[word]\n\n    def __len__(self):\n        return len(self.idx2word)\n\n\nclass Corpus(object):\n    def __init__(self, path):\n        self.dictionary = Dictionary()\n        self.train = self.tokenize(os.path.join(path, 'train.txt'))\n        self.valid = self.tokenize(os.path.join(path, 'valid.txt'))\n        self.test = self.tokenize(os.path.join(path, 'test.txt'))\n\n    def tokenize(self, path):\n        \"\"\"Tokenizes a text file.\"\"\"\n        assert os.path.exists(path)\n        # Add words to the dictionary\n        with open(path, 'r') as f:\n            tokens = 0\n            for line in f:\n                words = line.split() + ['<eos>']\n                tokens += len(words)\n                for word in words:\n                    self.dictionary.add_word(word)\n\n        # Tokenize file content\n        with open(path, 'r') as f:\n            ids = torch.LongTensor(tokens)\n            token = 0\n            for line in f:\n                words = line.split() + ['<eos>']\n                for word in words:\n                    ids[token] = self.dictionary.word2idx[word]\n                    token += 1\n\n        return ids\n"
  },
  {
    "path": "awd-lstm-lm/embed_regularize.py",
    "content": "import numpy as np\nimport pdb\nimport torch\n\ndef embedded_dropout(embed, words, dropout=0.1, scale=None):\n  if dropout:\n    mask = embed.weight.data.new().resize_((embed.weight.size(0), 1)).bernoulli_(1 - dropout).expand_as(embed.weight) / (1 - dropout)\n    masked_embed_weight = mask * embed.weight\n  else:\n    masked_embed_weight = embed.weight\n  if scale:\n    masked_embed_weight = scale.expand_as(masked_embed_weight) * masked_embed_weight\n\n  padding_idx = embed.padding_idx\n  if padding_idx is None:\n      padding_idx = -1\n\n  X = torch.nn.functional.embedding(words, masked_embed_weight,\n    padding_idx, embed.max_norm, embed.norm_type,\n    embed.scale_grad_by_freq, embed.sparse\n  )\n  return X\n\nif __name__ == '__main__':\n  V = 50\n  h = 4\n  bptt = 10\n  batch_size = 2\n\n  embed = torch.nn.Embedding(V, h)\n\n  words = np.random.random_integers(low=0, high=V-1, size=(batch_size, bptt))\n  words = torch.LongTensor(words)\n\n  origX = embed(words)\n  X = embedded_dropout(embed, words)\n\n  print(origX)\n  print(X)\n"
  },
  {
    "path": "awd-lstm-lm/finetune.py",
    "content": "import argparse\nimport time\nimport math\nimport numpy as np\nnp.random.seed(331)\nimport torch\nimport torch.nn as nn\n\nimport data\nimport model\n\nfrom utils import batchify, get_batch, repackage_hidden\n\nparser = argparse.ArgumentParser(description='PyTorch PennTreeBank RNN/LSTM Language Model')\nparser.add_argument('--data', type=str, default='data/penn/',\n                    help='location of the data corpus')\nparser.add_argument('--model', type=str, default='LSTM',\n                    help='type of recurrent net (RNN_TANH, RNN_RELU, LSTM, GRU)')\nparser.add_argument('--emsize', type=int, default=400,\n                    help='size of word embeddings')\nparser.add_argument('--nhid', type=int, default=1150,\n                    help='number of hidden units per layer')\nparser.add_argument('--nlayers', type=int, default=3,\n                    help='number of layers')\nparser.add_argument('--lr', type=float, default=30,\n                    help='initial learning rate')\nparser.add_argument('--clip', type=float, default=0.25,\n                    help='gradient clipping')\nparser.add_argument('--epochs', type=int, default=8000,\n                    help='upper epoch limit')\nparser.add_argument('--batch_size', type=int, default=80, metavar='N',\n                    help='batch size')\nparser.add_argument('--bptt', type=int, default=70,\n                    help='sequence length')\nparser.add_argument('--dropout', type=float, default=0.4,\n                    help='dropout applied to layers (0 = no dropout)')\nparser.add_argument('--dropouth', type=float, default=0.3,\n                    help='dropout for rnn layers (0 = no dropout)')\nparser.add_argument('--dropouti', type=float, default=0.65,\n                    help='dropout for input embedding layers (0 = no dropout)')\nparser.add_argument('--dropoute', type=float, default=0.1,\n                    help='dropout to remove words from embedding layer (0 = no 
dropout)')\nparser.add_argument('--wdrop', type=float, default=0.5,\n                    help='amount of weight dropout to apply to the RNN hidden to hidden matrix')\nparser.add_argument('--tied', action='store_false',\n                    help='tie the word embedding and softmax weights')\nparser.add_argument('--seed', type=int, default=1111,\n                    help='random seed')\nparser.add_argument('--nonmono', type=int, default=5,\n                    help='non-monotone interval (in epochs) for early stopping')\nparser.add_argument('--cuda', action='store_false',\n                    help='use CUDA')\nparser.add_argument('--log-interval', type=int, default=200, metavar='N',\n                    help='report interval')\nrandomhash = ''.join(str(time.time()).split('.'))\nparser.add_argument('--save', type=str,  default=randomhash+'.pt',\n                    help='path to save the final model')\nparser.add_argument('--alpha', type=float, default=2,\n                    help='alpha L2 regularization on RNN activation (alpha = 0 means no regularization)')\nparser.add_argument('--beta', type=float, default=1,\n                    help='beta slowness regularization applied on RNN activation (beta = 0 means no regularization)')\nparser.add_argument('--wdecay', type=float, default=1.2e-6,\n                    help='weight decay applied to all weights')\nargs = parser.parse_args()\n\n# Set the random seed manually for reproducibility.\ntorch.manual_seed(args.seed)\nif torch.cuda.is_available():\n    if not args.cuda:\n        print(\"WARNING: You have a CUDA device, so you should probably run with --cuda\")\n    else:\n        torch.cuda.manual_seed(args.seed)\n\n###############################################################################\n# Load data\n###############################################################################\n\ncorpus = data.Corpus(args.data)\n\neval_batch_size = 10\ntest_batch_size = 1\ntrain_data = batchify(corpus.train, args.batch_size, args)\nval_data = batchify(corpus.valid, 
eval_batch_size, args)\ntest_data = batchify(corpus.test, test_batch_size, args)\n\n###############################################################################\n# Build the model\n###############################################################################\n\nntokens = len(corpus.dictionary)\nmodel = model.RNNModel(args.model, ntokens, args.emsize, args.nhid, args.nlayers, args.dropout, args.dropouth, args.dropouti, args.dropoute, args.wdrop, args.tied)\nif args.cuda:\n    model.cuda()\ntotal_params = sum(x.size()[0] * x.size()[1] if len(x.size()) > 1 else x.size()[0] for x in model.parameters())\nprint('Args:', args)\nprint('Model total parameters:', total_params)\n\ncriterion = nn.CrossEntropyLoss()\n\n###############################################################################\n# Training code\n###############################################################################\n\ndef evaluate(data_source, batch_size=10):\n    # Turn on evaluation mode which disables dropout.\n    if args.model == 'QRNN': model.reset()\n    model.eval()\n    total_loss = 0\n    ntokens = len(corpus.dictionary)\n    hidden = model.init_hidden(batch_size)\n    for i in range(0, data_source.size(0) - 1, args.bptt):\n        data, targets = get_batch(data_source, i, args, evaluation=True)\n        output, hidden = model(data, hidden)\n        output_flat = output.view(-1, ntokens)\n        total_loss += len(data) * criterion(output_flat, targets).data\n        hidden = repackage_hidden(hidden)\n    return total_loss[0] / len(data_source)\n\n\ndef train():\n    # Turn on training mode which enables dropout.\n    if args.model == 'QRNN': model.reset()\n    total_loss = 0\n    start_time = time.time()\n    ntokens = len(corpus.dictionary)\n    hidden = model.init_hidden(args.batch_size)\n    batch, i = 0, 0\n    while i < train_data.size(0) - 1 - 1:\n        bptt = args.bptt if np.random.random() < 0.95 else args.bptt / 2.\n        # Prevent excessively small or negative sequence 
lengths\n        seq_len = max(5, int(np.random.normal(bptt, 5)))\n        # There's a very small chance that it could select a very long sequence length resulting in OOM\n        seq_len = min(seq_len, args.bptt + 10)\n\n        lr2 = optimizer.param_groups[0]['lr']\n        optimizer.param_groups[0]['lr'] = lr2 * seq_len / args.bptt\n        model.train()\n        data, targets = get_batch(train_data, i, args, seq_len=seq_len)\n\n        # Starting each batch, we detach the hidden state from how it was previously produced.\n        # If we didn't, the model would try backpropagating all the way to start of the dataset.\n        hidden = repackage_hidden(hidden)\n        optimizer.zero_grad()\n\n        output, hidden, rnn_hs, dropped_rnn_hs = model(data, hidden, return_h=True)\n        raw_loss = criterion(output.view(-1, ntokens), targets)\n\n        loss = raw_loss\n        # Activation Regularization\n        loss = loss + sum(args.alpha * dropped_rnn_h.pow(2).mean() for dropped_rnn_h in dropped_rnn_hs[-1:])\n        # Temporal Activation Regularization (slowness)\n        loss = loss + sum(args.beta * (rnn_h[1:] - rnn_h[:-1]).pow(2).mean() for rnn_h in rnn_hs[-1:])\n        loss.backward()\n\n        # `clip_grad_norm` helps prevent the exploding gradient problem in RNNs / LSTMs.\n        torch.nn.utils.clip_grad_norm(model.parameters(), args.clip)\n        optimizer.step()\n\n        total_loss += raw_loss.data\n        optimizer.param_groups[0]['lr'] = lr2\n        if batch % args.log_interval == 0 and batch > 0:\n            cur_loss = total_loss[0] / args.log_interval\n            elapsed = time.time() - start_time\n            print('| epoch {:3d} | {:5d}/{:5d} batches | lr {:02.2f} | ms/batch {:5.2f} | '\n                    'loss {:5.2f} | ppl {:8.2f}'.format(\n                epoch, batch, len(train_data) // args.bptt, optimizer.param_groups[0]['lr'],\n                elapsed * 1000 / args.log_interval, cur_loss, math.exp(cur_loss)))\n            
total_loss = 0\n            start_time = time.time()\n        ###\n        batch += 1\n        i += seq_len\n\n\n# Load the best saved model.\nwith open(args.save, 'rb') as f:\n    model = torch.load(f)\n\n\n# Loop over epochs.\nlr = args.lr\nstored_loss = evaluate(val_data)\nbest_val_loss = []\n# At any point you can hit Ctrl + C to break out of training early.\ntry:\n    #optimizer = torch.optim.ASGD(model.parameters(), lr=args.lr, weight_decay=args.wdecay)\n    optimizer = torch.optim.ASGD(model.parameters(), lr=args.lr, t0=0, lambd=0., weight_decay=args.wdecay)\n    for epoch in range(1, args.epochs+1):\n        epoch_start_time = time.time()\n        train()\n        if 't0' in optimizer.param_groups[0]:\n            tmp = {}\n            for prm in model.parameters():\n                tmp[prm] = prm.data.clone()\n                prm.data = optimizer.state[prm]['ax'].clone()\n\n            val_loss2 = evaluate(val_data)\n            print('-' * 89)\n            print('| end of epoch {:3d} | time: {:5.2f}s | valid loss {:5.2f} | '\n                    'valid ppl {:8.2f}'.format(epoch, (time.time() - epoch_start_time),\n                                               val_loss2, math.exp(val_loss2)))\n            print('-' * 89)\n\n            if val_loss2 < stored_loss:\n                with open(args.save, 'wb') as f:\n                    torch.save(model, f)\n                print('Saving Averaged!')\n                stored_loss = val_loss2\n\n            for prm in model.parameters():\n                prm.data = tmp[prm].clone()\n\n        if (len(best_val_loss)>args.nonmono and val_loss2 > min(best_val_loss[:-args.nonmono])):\n            print('Done!')\n            import sys\n            sys.exit(1)\n            optimizer = torch.optim.ASGD(model.parameters(), lr=args.lr, t0=0, lambd=0., weight_decay=args.wdecay)\n            #optimizer.param_groups[0]['lr'] /= 2.\n        best_val_loss.append(val_loss2)\n\nexcept KeyboardInterrupt:\n    print('-' * 89)\n   
 print('Exiting from training early')\n\n# Load the best saved model.\nwith open(args.save, 'rb') as f:\n    model = torch.load(f)\n    \n# Run on test data.\ntest_loss = evaluate(test_data, test_batch_size)\nprint('=' * 89)\nprint('| End of training | test loss {:5.2f} | test ppl {:8.2f}'.format(\n    test_loss, math.exp(test_loss)))\nprint('=' * 89)\n"
  },
  {
    "path": "awd-lstm-lm/generate.py",
    "content": "###############################################################################\n# Language Modeling on Penn Tree Bank\n#\n# This file generates new sentences sampled from the language model\n#\n###############################################################################\n\nimport argparse\n\nimport torch\nfrom torch.autograd import Variable\n\nimport data\n\nparser = argparse.ArgumentParser(description='PyTorch PTB Language Model')\n\n# Model parameters.\nparser.add_argument('--data', type=str, default='./data/penn',\n                    help='location of the data corpus')\nparser.add_argument('--model', type=str, default='LSTM',\n                    help='type of recurrent net (LSTM, QRNN)')\nparser.add_argument('--checkpoint', type=str, default='./model.pt',\n                    help='model checkpoint to use')\nparser.add_argument('--outf', type=str, default='generated.txt',\n                    help='output file for generated text')\nparser.add_argument('--words', type=int, default='1000',\n                    help='number of words to generate')\nparser.add_argument('--seed', type=int, default=1111,\n                    help='random seed')\nparser.add_argument('--cuda', action='store_true',\n                    help='use CUDA')\nparser.add_argument('--temperature', type=float, default=1.0,\n                    help='temperature - higher will increase diversity')\nparser.add_argument('--log-interval', type=int, default=100,\n                    help='reporting interval')\nargs = parser.parse_args()\n\n# Set the random seed manually for reproducibility.\ntorch.manual_seed(args.seed)\nif torch.cuda.is_available():\n    if not args.cuda:\n        print(\"WARNING: You have a CUDA device, so you should probably run with --cuda\")\n    else:\n        torch.cuda.manual_seed(args.seed)\n\nif args.temperature < 1e-3:\n    parser.error(\"--temperature has to be greater or equal 1e-3\")\n\nwith open(args.checkpoint, 'rb') as f:\n    model = 
torch.load(f)\nmodel.eval()\nif args.model == 'QRNN':\n    model.reset()\n\nif args.cuda:\n    model.cuda()\nelse:\n    model.cpu()\n\ncorpus = data.Corpus(args.data)\nntokens = len(corpus.dictionary)\nhidden = model.init_hidden(1)\ninput = Variable(torch.rand(1, 1).mul(ntokens).long(), volatile=True)\nif args.cuda:\n    input.data = input.data.cuda()\n\nwith open(args.outf, 'w') as outf:\n    for i in range(args.words):\n        output, hidden = model(input, hidden)\n        word_weights = output.squeeze().data.div(args.temperature).exp().cpu()\n        word_idx = torch.multinomial(word_weights, 1)[0]\n        input.data.fill_(word_idx)\n        word = corpus.dictionary.idx2word[word_idx]\n\n        outf.write(word + ('\\n' if i % 20 == 19 else ' '))\n\n        if i % args.log_interval == 0:\n            print('| Generated {}/{} words'.format(i, args.words))\n"
  },
  {
    "path": "awd-lstm-lm/getdata.sh",
    "content": "echo \"=== Acquiring datasets ===\"\necho \"---\"\nmkdir -p save\n\nmkdir -p data\ncd data\n\n#echo \"- Downloading WikiText-2 (WT2)\"\n#wget --quiet --continue https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip\n#unzip -q wikitext-2-v1.zip\n#cd wikitext-2\n#mv wiki.train.tokens train.txt\n#mv wiki.valid.tokens valid.txt\n#mv wiki.test.tokens test.txt\n#cd ..\n#\n#echo \"- Downloading WikiText-103 (WT2)\"\n#wget --continue https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-103-v1.zip\n#unzip -q wikitext-103-v1.zip\n#cd wikitext-103\n#mv wiki.train.tokens train.txt\n#mv wiki.valid.tokens valid.txt\n#mv wiki.test.tokens test.txt\n#cd ..\n#\n#echo \"- Downloading enwik8 (Character)\"\n#mkdir -p enwik8\n#cd enwik8\n#wget --continue http://mattmahoney.net/dc/enwik8.zip\n#python prep_enwik8.py\n#cd ..\n\necho \"- Downloading Penn Treebank (PTB)\"\nwget --quiet --continue http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz\ntar -xzf simple-examples.tgz\n\nmkdir -p penn\ncd penn\nmv ../simple-examples/data/ptb.train.txt train.txt\nmv ../simple-examples/data/ptb.test.txt test.txt\nmv ../simple-examples/data/ptb.valid.txt valid.txt\ncd ..\n\n#echo \"- Downloading Penn Treebank (Character)\"\n#mkdir -p pennchar\n#cd pennchar\n#mv ../simple-examples/data/ptb.char.train.txt train.txt\n#mv ../simple-examples/data/ptb.char.test.txt test.txt\n#mv ../simple-examples/data/ptb.char.valid.txt valid.txt\n#cd ..\n#\nrm -rf simple-examples/\n\n# echo \"- Downloading WikiText-2 (WT2)\"\n# wget --quiet --continue https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip\n# unzip -q wikitext-2-v1.zip\n# cd wikitext-2\n# mv wiki.train.tokens train.txt\n# mv wiki.valid.tokens valid.txt\n# mv wiki.test.tokens test.txt\n# \necho \"---\"\necho \"Happy language modeling :)\"\n"
  },
  {
    "path": "awd-lstm-lm/locked_dropout.py",
    "content": "import torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\n\nclass LockedDropout(nn.Module):\n    def __init__(self):\n        super().__init__()\n\n    def forward(self, x, dropout=0.5):\n        if not self.training or not dropout:\n            return x\n        m = x.data.new(1, x.size(1), x.size(2)).bernoulli_(1 - dropout)\n        mask = Variable(m, requires_grad=False) / (1 - dropout)\n        mask = mask.expand_as(x)\n        return mask * x\n"
  },
  {
    "path": "awd-lstm-lm/main.py",
    "content": "# OpusHdfsCopy is an internal-only module; make it optional so the script still runs without it\n# (its only call sites below are commented out).\ntry:\n    from OpusHdfsCopy import transferFileToHdfsDir, checkHdfs\nexcept ImportError:\n    transferFileToHdfsDir = checkHdfs = None\nimport argparse\nimport time\nimport math\nimport numpy as np\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\n\nimport pdb\n\nimport data\nimport model\n\nfrom utils import batchify, get_batch, repackage_hidden\n\nparser = argparse.ArgumentParser(description='PyTorch PennTreeBank RNN/LSTM Language Model')\nparser.add_argument('--data', type=str, default='data/penn/',\n                    help='location of the data corpus')\nparser.add_argument('--model', type=str, default='PLASTICLSTM',\n                    help='type of recurrent net (LSTM, QRNN, GRU, PLASTICLSTM, MYLSTM, FASTPLASTICLSTM, SIMPLEPLASTICLSTM)')\nparser.add_argument('--alphatype', type=str, default='full',\n        help=\"type of alpha matrix: (full, perneuron, single)\")\nparser.add_argument('--modultype', type=str, default='none',\n        help=\"type of modulation: (none, modplasth2mod, modplastc2mod)\")\nparser.add_argument('--modulout', type=str, default='single',\n        help=\"modulatory output (single or fanout)\")\nparser.add_argument('--cliptype', type=str, default='clip',\n                    help=\"clip type (decay, clip, aditya)\")\nparser.add_argument('--hebboutput', type=str, default='i2c',\n                    help='output used for hebbian computations (i2c, h2co, cell, hidden)')\nparser.add_argument('--emsize', type=int, default=400,\n                    help='size of word embeddings')\nparser.add_argument('--nhid', type=int, default=1150,\n                    help='number of hidden units per layer')\nparser.add_argument('--nlayers', type=int, default=3,\n                    help='number of layers')\nparser.add_argument('--clipval', type=float, default=2.0,\n                    help='value of the hebbian trace clipping')\nparser.add_argument('--lr', type=float, default=30,\n                    help='initial learning rate')\nparser.add_argument('--agdiv', 
type=float, default=1150.0,\n                    help='divider of the gradient of alpha')\nparser.add_argument('--clip', type=float, default=0.25,\n                    help='gradient clipping')\nparser.add_argument('--epochs', type=int, default=300,\n                    help='upper epoch limit')\nparser.add_argument('--batch_size', type=int, default=80, metavar='N',\n                    help='batch size')\nparser.add_argument('--bptt', type=int, default=70,\n                    help='sequence length')\nparser.add_argument('--dropout', type=float, default=0.4,\n                    help='dropout applied to layers (0 = no dropout)')\nparser.add_argument('--dropouth', type=float, default=0.3,\n                    help='dropout for rnn layers (0 = no dropout)')\nparser.add_argument('--dropouti', type=float, default=0.65,\n                    help='dropout for input embedding layers (0 = no dropout)')\nparser.add_argument('--dropoute', type=float, default=0.1,\n                    help='dropout to remove words from embedding layer (0 = no dropout)')\nparser.add_argument('--proplstm', type=float, default=0.5,\n                    help='for split-lstms: proportion of LSTM cells in the recurrent layer')\nparser.add_argument('--wdrop', type=float, default=0.5,\n                    help='amount of weight dropout to apply to the RNN hidden to hidden matrix')\nparser.add_argument('--seed', type=int, default=1111,\n                    help='random seed')\nparser.add_argument('--asgdtime', type=int, default=-1,\n                    help='number of iterations before switch to ASGD (if positive)')\nparser.add_argument('--nonmono', type=int, default=5,\n                    help='range of non monotonicity before switch to ASGD (if asgdtime is negative)')\nparser.add_argument('--cuda', action='store_false',\n                    help='use CUDA')\nparser.add_argument('--numgpu', type=int, default=0,\n                    help='which GPU to use? 
(no effect if GPU not used at all)')\nparser.add_argument('--log-interval', type=int, default=200, metavar='N',\n                    help='report interval')\nrandomhash = ''.join(str(time.time()).split('.'))\nparser.add_argument('--save', type=str,  default=randomhash+'.pt',\n                    help='path to save the final model')\nparser.add_argument('--alpha', type=float, default=2,\n                    help='alpha L2 regularization on RNN activation (alpha = 0 means no regularization)')\nparser.add_argument('--beta', type=float, default=1,\n                    help='beta slowness regularization applied on RNN activation (beta = 0 means no regularization)')\nparser.add_argument('--wdecay', type=float, default=1.2e-6,\n                    help='weight decay applied to all weights')\nparser.add_argument('--resume', type=str,  default='',\n                    help='path of model to resume')\nparser.add_argument('--optimizer', type=str,  default='sgd',\n                    help='optimizer to use (sgd, adam)')\nparser.add_argument('--when', nargs=\"+\", type=int, default=[-1],\n                    help='When (which epochs) to divide the learning rate by 10 - accepts multiple')\nargs = parser.parse_args()\nargs.tied = True\n\n# Set the random seed manually for reproducibility.\nnp.random.seed(args.seed)\ntorch.manual_seed(args.seed)\nif torch.cuda.is_available():\n    if not args.cuda:\n        print(\"WARNING: You have a CUDA device, so you should probably run with --cuda\")\n    else:\n        torch.cuda.manual_seed(args.seed)\nelse:\n    print(\"NOTE: no CUDA device detected.\")\n\nimport platform\nprint(\"PyTorch version:\", torch.__version__, \"Numpy version:\", np.version.version, \"Python version:\", platform.python_version(), \"GPU used (if any):\", args.numgpu)\n\n###############################################################################\n# Load data\n###############################################################################\n\ndef 
model_save(fn):\n    with open(fn, 'wb') as f:\n        torch.save([model, criterion, optimizer], f)\n\ndef model_load(fn):\n    global model, criterion, optimizer\n    with open(fn, 'rb') as f:\n        model, criterion, optimizer = torch.load(f)\n\nimport os\nimport hashlib\nfn = 'corpus.{}.data'.format(hashlib.md5(args.data.encode()).hexdigest())\nif os.path.exists(fn):\n    print('Loading cached dataset...')\n    corpus = torch.load(fn)\nelse:\n    print('Producing dataset...')\n    corpus = data.Corpus(args.data)\n    torch.save(corpus, fn)\n\neval_batch_size = 10\ntest_batch_size = 1\ntrain_data = batchify(corpus.train, args.batch_size, args)\nval_data = batchify(corpus.valid, eval_batch_size, args)\ntest_data = batchify(corpus.test, test_batch_size, args)\n\n\n#train_data = train_data[:5000,:]   # For debugging\n\n###############################################################################\n# Build the model\n###############################################################################\n\nfrom splitcross import SplitCrossEntropyLoss\ncriterion = None\n\nntokens = len(corpus.dictionary)\n\n# Configuration parameters of the plastic LSTM. 
See mylstm.py for details.\nmyparams={}\nmyparams['clipval'] = args.clipval\nmyparams['cliptype'] = args.cliptype\nmyparams['modultype'] = args.modultype\nmyparams['modulout'] = args.modulout\nmyparams['hebboutput'] = args.hebboutput\nmyparams['alphatype'] = args.alphatype\n\nsuffix = '_SqUsq_'+args.model+'_'+myparams['cliptype']+'_cv'+str(myparams['clipval'])+'_'+myparams['modultype']+'_'+myparams['modulout']+'_'+myparams['hebboutput']+'_'+myparams['alphatype']+'_asgdtime'+str(args.asgdtime)+'_agdiv'+str(int(args.agdiv))+'_lr'+str(args.lr)+'_'+str(args.nlayers)+'l_'+str(args.nhid)+'h_'+str(args.proplstm)+'lstm_rngseed'+str(args.seed)\nprint(\"Suffix:\", suffix)\nMODELFILENAME = 'model_'+suffix+'.dat'\nRESULTSFILENAME = 'results_'+suffix+'.txt'\nFILENAMESTOSAVE = [MODELFILENAME, RESULTSFILENAME]  # We will append to this list the additional files at each learning rate reduction, if any\n\nprint(\"Plasticity and neuromodulation parameters:\", myparams)\nmodel = model.RNNModel(args.model, ntokens, args.emsize, args.nhid, args.proplstm, args.nlayers, args.dropout, args.dropouth, args.dropouti, args.dropoute, args.wdrop, args.tied, myparams)\n###\nif args.resume:\n    print('Resuming model ...')\n    model_load(args.resume)\n    optimizer.param_groups[0]['lr'] = args.lr\n    model.dropouti, model.dropouth, model.dropout, args.dropoute = args.dropouti, args.dropouth, args.dropout, args.dropoute\n    if args.wdrop:\n        from weight_drop import WeightDrop\n        for rnn in model.rnns:\n            if type(rnn) == WeightDrop: rnn.dropout = args.wdrop\n            elif rnn.zoneout > 0: rnn.zoneout = args.wdrop\n###\nif not criterion:\n    splits = []\n    if ntokens > 500000:\n        # One Billion\n        # This produces fairly even matrix mults for the buckets:\n        # 0: 11723136, 1: 10854630, 2: 11270961, 3: 11219422\n        splits = [4200, 35000, 180000]\n    elif ntokens > 75000:\n        # WikiText-103\n        splits = [2800, 20000, 76000]\n    
print('Using', splits)\n    criterion = SplitCrossEntropyLoss(args.emsize, splits=splits, verbose=False)\n###\nparams = list(model.parameters()) + list(criterion.parameters())\nif args.cuda:\n    model = model.cuda(args.numgpu)\n    criterion = criterion.cuda(args.numgpu)\n    params = list(model.parameters()) + list(criterion.parameters())\n###\n#total_params = sum(x.size()[0] * x.size()[1] if len(x.size()) > 1 else x.size()[0] for x in params if x.size()) # Smerity version, doesn't work when size==3\ntotal_params = sum(x.numel() for x in params if x.numel())\nprint('Args:', args)\nprint('Model total parameters:', total_params)\n\n\n###############################################################################\n# Training code\n###############################################################################\n\ndef evaluate(data_source, batch_size=10):\n    # Turn on evaluation mode which disables dropout.\n    model.eval()\n    with torch.no_grad():\n        if args.model == 'QRNN': model.reset()\n        total_loss = 0\n        ntokens = len(corpus.dictionary)\n        hidden = model.init_hidden(batch_size)\n        for i in range(0, data_source.size(0) - 1, args.bptt):\n            data, targets = get_batch(data_source, i, args, evaluation=True)\n            output, hidden = model(data, hidden)\n            total_loss += len(data) * criterion(model.decoder.weight, model.decoder.bias, output, targets).data\n            hidden = repackage_hidden(hidden)\n        #return total_loss[0] / len(data_source) # Error under modern PyTorch\n    return total_loss / len(data_source)\n\n\ndef train():\n    # Turn on training mode which enables dropout.\n    if args.model == 'QRNN': model.reset()\n    total_loss = 0\n    start_time = time.time()\n    ntokens = len(corpus.dictionary)\n    hidden = model.init_hidden(args.batch_size)\n    batch, i = 0, 0\n    while i < train_data.size(0) - 1 - 1:\n        bptt = args.bptt if np.random.random() < 0.95 else args.bptt / 2.\n        
# Prevent excessively small or negative sequence lengths\n        seq_len = max(5, int(np.random.normal(bptt, 5)))\n        # There's a very small chance that it could select a very long sequence length resulting in OOM\n        # NOTE: this was commented out in smerity's code!\n        seq_len = min(seq_len, args.bptt + 10)\n\n        lr2 = optimizer.param_groups[0]['lr']\n        optimizer.param_groups[0]['lr'] = lr2 * seq_len / args.bptt\n        model.train()\n        data, targets = get_batch(train_data, i, args, seq_len=seq_len)\n\n        # Starting each batch, we detach the hidden state from how it was previously produced.\n        # If we didn't, the model would try backpropagating all the way to start of the dataset.\n        # NOTE: Now 'hidden' includes the Hebbian traces if using plasticity.\n        hidden = repackage_hidden(hidden)\n        optimizer.zero_grad()\n\n        output, hidden, rnn_hs, dropped_rnn_hs = model(data, hidden, return_h=True)\n        raw_loss = criterion(model.decoder.weight, model.decoder.bias, output, targets)\n\n        loss = raw_loss\n        # Activation Regularization\n        if args.alpha: loss = loss + sum(args.alpha * dropped_rnn_h.pow(2).mean() for dropped_rnn_h in dropped_rnn_hs[-1:])\n        # Temporal Activation Regularization (slowness)\n        if args.beta: loss = loss + sum(args.beta * (rnn_h[1:] - rnn_h[:-1]).pow(2).mean() for rnn_h in rnn_hs[-1:])\n        loss.backward()\n\n        # When using plastic LSTMs,\n        # we divide the gradient on the alphas by the number of inputs, i.e.\n        # the number of recurrent neurons, but only if plasticity is\n        # 'perneuron' or 'single' (as opposed to 'full'). 
\n        # This is necessary to preserve stability while using the same learning rate as Merity et al.\n        if args.model == 'PLASTICLSTM' or args.model == 'SPLITLSTM' or args.model == 'FASTPLASTICLSTM':\n            if args.alphatype == 'perneuron' or args.alphatype == 'single':  # Based on other experiments, this is actually not good for full-plasticity\n                for x in model.rnns:\n                    if hasattr(x.alpha.grad, 'data'):\n                        x.alpha.grad.data /= args.agdiv\n        \n        # `clip_grad_norm` helps prevent the exploding gradient problem in RNNs / LSTMs.\n        if args.clip: torch.nn.utils.clip_grad_norm(model.parameters(), args.clip)\n        \n        # OPTIMIZATION STEP\n        optimizer.step()\n\n        total_loss += raw_loss.data\n        optimizer.param_groups[0]['lr'] = lr2\n        if batch % args.log_interval == 0 and batch > 0:\n            cur_loss = total_loss / args.log_interval\n            elapsed = time.time() - start_time\n            print('| epoch {:3d} | {:5d}/{:5d} batches | lr {:05.5f} | ms/batch {:5.2f} | '\n                    'loss {:5.2f} | ppl {:8.2f} | bpc {:8.3f}'.format(\n                epoch, batch, len(train_data) // args.bptt, optimizer.param_groups[0]['lr'],\n                elapsed * 1000 / args.log_interval, cur_loss, math.exp(cur_loss), cur_loss / math.log(2)))\n            total_loss = 0\n            start_time = time.time()\n        ###\n        batch += 1\n        i += seq_len\n\n# Loop over epochs.\nlr = args.lr\nbest_val_loss = []\nstored_loss = 100000000\n\n\n# At any point you can hit Ctrl + C to break out of training early.\ntry:\n    optimizer = None\n    if args.optimizer == 'sgd':\n        optimizer = torch.optim.SGD(model.parameters(), lr=args.lr, weight_decay=args.wdecay)\n    if args.optimizer == 'adam':\n        optimizer = torch.optim.Adam(model.parameters(), lr=args.lr, weight_decay=args.wdecay)\n\n    allvallosses = []\n    for epoch in range(1, 
args.epochs+1):\n        epoch_start_time = time.time()\n        train()\n        if 't0' in optimizer.param_groups[0]:  # Are we in the ASGD regime?\n            tmp = {}\n            for prm in model.parameters():\n                tmp[prm] = prm.data.clone()\n                # NOTE (TM): the following line may cause trouble after the switch to ASGD if some declared pytorch Parameters of the network are not actually used in the computational graph\n                prm.data = optimizer.state[prm]['ax'].clone()\n\n            val_loss2 = evaluate(val_data, eval_batch_size)\n            print('-' * 89)\n            print('| end of epoch {:3d} (t0 on) | time: {:5.2f}s | valid loss {:5.2f} | '\n                'valid ppl {:8.2f} | valloss2 ppl {:8.2f}'.format(\n              epoch, (time.time() - epoch_start_time), val_loss, math.exp(val_loss), math.exp(val_loss2)))\n            print('-' * 89)\n\n            if val_loss2 < stored_loss:\n                model_save(MODELFILENAME)\n                print('Saving Averaged!')\n                stored_loss = val_loss2\n\n            for prm in model.parameters():\n                prm.data = tmp[prm].clone()\n\n            allvallosses.append(val_loss2)\n\n        else:\n            val_loss = evaluate(val_data, eval_batch_size)\n            print('-' * 89)\n            print('| end of epoch {:3d} | time: {:5.2f}s | valid loss {:5.2f} | '\n                'valid ppl {:8.2f} | valid bpc {:8.3f}'.format(\n              epoch, (time.time() - epoch_start_time), val_loss, math.exp(val_loss), val_loss / math.log(2)))\n            print('-' * 89)\n\n            if val_loss < stored_loss:\n                model_save(MODELFILENAME)\n                print('Saving model (new best validation)')\n                stored_loss = val_loss\n\n            if args.optimizer == 'sgd' and 't0' not in optimizer.param_groups[0]:\n                if (args.asgdtime < 0 and len(best_val_loss)>args.nonmono and val_loss > 
min(best_val_loss[:-args.nonmono])) or (args.asgdtime > 0 and len(best_val_loss) == args.asgdtime) :\n\n                    print('Switching to ASGD')\n                    optimizer = torch.optim.ASGD(model.parameters(), lr=args.lr, t0=0, lambd=0., weight_decay=args.wdecay)\n\n            if epoch in args.when:\n                print('Saving model before learning rate decreased')\n                EPOCHFILENAME = '{}.e{}'.format(MODELFILENAME, epoch)\n                model_save(EPOCHFILENAME)\n                FILENAMESTOSAVE.append(EPOCHFILENAME)\n                print('Dividing learning rate by 10')\n                optimizer.param_groups[0]['lr'] /= 10.\n\n            best_val_loss.append(val_loss)\n            \n            allvallosses.append(val_loss)\n\n        np.savetxt(RESULTSFILENAME, allvallosses)\n\n        # Saving files remotely.... (Uber only!)\n        if os.path.isdir('/mnt/share/tmiconi'):\n            print(\"Transferring to NFS storage...\")\n            for fn in FILENAMESTOSAVE:\n                result = os.system(\n                    'cp {} {}'.format(fn, '/mnt/share/tmiconi/ptb/'+fn))\n            print(\"Done!\")\n        #if checkHdfs():\n        #    print(\"Transfering to HDFS...\")\n        #    for fn in FILENAMESTOSAVE:\n        #        transferFileToHdfsDir(fn, '/ailabs/tmiconi/ptb/')\n\n\nexcept KeyboardInterrupt:\n    print('-' * 89)\n    print('Exiting from training early')\n\n# Load the best saved model.\nmodel_load(MODELFILENAME)\n\n# Run on test data.\ntest_loss = evaluate(test_data, test_batch_size)\nprint('=' * 89)\nprint('| End of training | test loss {:5.2f} | test ppl {:8.2f} | test bpc {:8.3f}'.format(\n    test_loss, math.exp(test_loss), test_loss / math.log(2)))\nprint('=' * 89)\n"
  },
  {
    "path": "awd-lstm-lm/model.py",
    "content": "import torch\nimport torch.nn as nn\n#from torch.autograd import Variable\n\nfrom embed_regularize import embedded_dropout\nfrom locked_dropout import LockedDropout\nfrom weight_drop import WeightDrop\n\nimport random, pdb\n\nimport mylstm\n\nclass RNNModel(nn.Module):\n    \"\"\"Container module with an encoder, a recurrent module, and a decoder.\"\"\"\n\n    def __init__(self, rnn_type, ntoken, ninp, nhid, proplstm, nlayers, dropout=0.5, dropouth=0.5, dropouti=0.5, dropoute=0.1, wdrop=0, tie_weights=False, params={}):\n        super(RNNModel, self).__init__()\n        self.lockdrop = LockedDropout()\n        self.idrop = nn.Dropout(dropouti)\n        self.hdrop = nn.Dropout(dropouth)\n        self.drop = nn.Dropout(dropout)\n        self.encoder = nn.Embedding(ntoken, ninp)\n        assert rnn_type in ['LSTM', 'QRNN', 'GRU', 'MYLSTM', 'MYFASTLSTM', 'SIMPLEPLASTICLSTM', 'FASTPLASTICLSTM', 'PLASTICLSTM', 'SPLITLSTM'], 'RNN type is not supported'\n        if rnn_type == 'LSTM':\n            self.rnns = [torch.nn.LSTM(ninp if l == 0 else nhid, nhid if l != nlayers - 1 else (ninp if tie_weights else nhid), 1, dropout=0) for l in range(nlayers)]\n\n            #for rr in self.rnns:\n            #    rr.flatten_parameters()\n            if wdrop:\n                print(\"Using WeightDrop!\")\n                self.rnns = [WeightDrop(rnn, ['weight_hh_l0'], dropout=wdrop) for rnn in self.rnns]\n\n        elif rnn_type == 'MYLSTM': \n            self.rnns = [mylstm.MyLSTM(ninp if l == 0 else nhid, nhid if l != nlayers - 1 else (ninp if tie_weights else nhid)) for l in range(nlayers)]\n\n        elif rnn_type == 'MYFASTLSTM': \n            self.rnns = [mylstm.MyFastLSTM(ninp if l == 0 else nhid, nhid if l != nlayers - 1 else (ninp if tie_weights else nhid)) for l in range(nlayers)]\n\n        elif rnn_type == 'PLASTICLSTM':\n            self.rnns = [mylstm.PlasticLSTM(ninp if l == 0 else nhid, nhid if l != nlayers - 1 else (ninp if tie_weights else nhid), 
params) for l in range(nlayers)]\n\n        elif rnn_type == 'SIMPLEPLASTICLSTM':\n            # Note that this one ignores the 'params' argument, which is only kept to preserve identical signature with PlasticLSTM\n            self.rnns = [mylstm.SimplePlasticLSTM(ninp if l == 0 else nhid, nhid if l != nlayers - 1 else (ninp if tie_weights else nhid), params) for l in range(nlayers)]\n\n        elif rnn_type == 'FASTPLASTICLSTM':\n            self.rnns = [mylstm.MyFastPlasticLSTM(ninp if l == 0 else nhid, nhid if l != nlayers - 1 else (ninp if tie_weights else nhid), params) for l in range(nlayers)]\n\n        elif rnn_type == 'SPLITLSTM': # Not used\n            self.rnns = [mylstm.SplitLSTM(ninp if l == 0 else nhid, nhid if l != nlayers - 1 else (ninp if tie_weights else nhid), proplstm, params) for l in range(nlayers)]\n\n        elif rnn_type == 'GRU':\n            self.rnns = [torch.nn.GRU(ninp if l == 0 else nhid, nhid if l != nlayers - 1 else ninp, 1, dropout=0) for l in range(nlayers)]\n            if wdrop:\n                self.rnns = [WeightDrop(rnn, ['weight_hh_l0'], dropout=wdrop) for rnn in self.rnns]\n        elif rnn_type == 'QRNN':\n            from torchqrnn import QRNNLayer\n            self.rnns = [QRNNLayer(input_size=ninp if l == 0 else nhid, hidden_size=nhid if l != nlayers - 1 else (ninp if tie_weights else nhid), save_prev_x=True, zoneout=0, window=2 if l == 0 else 1, output_gate=True) for l in range(nlayers)]\n            for rnn in self.rnns:\n                rnn.linear = WeightDrop(rnn.linear, ['weight'], dropout=wdrop)\n        print(self.rnns)\n        self.rnns = torch.nn.ModuleList(self.rnns)\n        self.decoder = nn.Linear(nhid, ntoken)\n\n        # Optionally tie weights as in:\n        # \"Using the Output Embedding to Improve Language Models\" (Press & Wolf 2016)\n        # https://arxiv.org/abs/1608.05859\n        # and\n        # \"Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling\" (Inan et al. 
2016)\n        # https://arxiv.org/abs/1611.01462\n        if tie_weights:\n            #if nhid != ninp:\n            #    raise ValueError('When using the tied flag, nhid must be equal to emsize')\n            self.decoder.weight = self.encoder.weight\n\n        self.init_weights()\n\n        self.rnn_type = rnn_type\n        self.ninp = ninp\n        self.nhid = nhid\n        self.proplstm = proplstm\n        self.nlayers = nlayers\n        self.dropout = dropout\n        self.dropouti = dropouti\n        self.dropouth = dropouth\n        self.dropoute = dropoute\n        self.tie_weights = tie_weights\n\n\n\n    def reset(self):\n        if self.rnn_type == 'QRNN': [r.reset() for r in self.rnns]\n\n    def init_weights(self):\n        initrange = 0.1\n        self.encoder.weight.data.uniform_(-initrange, initrange)\n        self.decoder.bias.data.fill_(0)\n        self.decoder.weight.data.uniform_(-initrange, initrange)\n\n    def forward(self, input, hidden, return_h=False):\n        emb = embedded_dropout(self.encoder, input, dropout=self.dropoute if self.training else 0)\n        #emb = self.idrop(emb)\n\n        emb = self.lockdrop(emb, self.dropouti)\n\n        raw_output = emb\n        new_hidden = []\n        #raw_output, hidden = self.rnn(emb, hidden)\n        raw_outputs = []\n        outputs = []\n        for l, rnn in enumerate(self.rnns):\n            current_input = raw_output\n            # Each rnn is a layer!\n            # each raw_output has shape seq_len x batch_size x nb_hidden\n            # new_h is a tuple of 2 elements, each of size 1 x batch_size x nb_hidden (last h and last c)\n            if self.rnn_type != 'MYLSTM' and self.rnn_type != 'MYFASTLSTM' and self.rnn_type != 'SIMPLEPLASTICLSTM' and self.rnn_type != 'PLASTICLSTM' and self.rnn_type != 'FASTPLASTICLSTM' and self.rnn_type != 'SPLITLSTM':\n                raw_output, new_h = rnn(raw_output, hidden[l])\n            else:\n                single_h = hidden[l]  # actually a 
tuple, includes the h and the c (and for plastic LSTMs, includes Hebb as third element!)\n                singleouts = []\n                for z in range(raw_output.shape[0]):\n                    singleout, single_h = rnn(raw_output[z], single_h)\n                    #if z==0:\n                    #    print(\"RANDOM NUMBER 1:\",float(torch.rand(1)))\n                    singleouts.append(singleout)\n                new_h = single_h  # the last (h,c[,hebb]) after the sequence is processed\n                raw_output = torch.stack(singleouts)\n            new_hidden.append(new_h)\n            raw_outputs.append(raw_output)\n            if l != self.nlayers - 1:\n                #self.hdrop(raw_output)\n                # lockdrop will zero out some output units over the whole sequence (separately chosen for each batch, but fixed across sequence)\n                #pdb.set_trace()\n                raw_output = self.lockdrop(raw_output, self.dropouth)\n                outputs.append(raw_output)\n                #pdb.set_trace()\n        hidden = new_hidden\n        #pdb.set_trace()\n\n        output = self.lockdrop(raw_output, self.dropout)\n        outputs.append(output)\n\n        result = output.view(output.size(0)*output.size(1), output.size(2))\n        if return_h:\n            return result, hidden, raw_outputs, outputs\n        return result, hidden\n\n    def init_hidden(self, bsz):\n        weight = next(self.parameters()).data\n        if self.rnn_type == 'MYLSTM' or self.rnn_type == 'MYFASTLSTM':\n            return [((weight.new(bsz, self.nhid if l != self.nlayers - 1 else (self.ninp if self.tie_weights else self.nhid)).zero_()),\n                    (weight.new(bsz, self.nhid if l != self.nlayers - 1 else (self.ninp if self.tie_weights else self.nhid)).zero_()))\n                    for l in range(self.nlayers)]\n        elif self.rnn_type == 'PLASTICLSTM' or self.rnn_type == 'SIMPLEPLASTICLSTM':\n            return [(\n                    (weight.new(bsz, 
self.nhid if l != self.nlayers - 1 else (self.ninp if self.tie_weights else self.nhid)).zero_()), # h state\n                    (weight.new(bsz, self.nhid if l != self.nlayers - 1 else (self.ninp if self.tie_weights else self.nhid)).zero_()), # c state\n                    (weight.new(bsz, self.rnns[l].w.shape[0], self.rnns[l].w.shape[1]).zero_()) # hebbian trace for the recurrent weights\n                    #(weight.new(bsz, self.rnns[l].isize, self.rnns[l].hsize).zero_())  # hebbian trace for the input weights (not necessarily used)\n                    )\n                    for l in range(self.nlayers)]\n        elif self.rnn_type == 'FASTPLASTICLSTM':\n            return [(\n                    (weight.new(bsz, self.nhid if l != self.nlayers - 1 else (self.ninp if self.tie_weights else self.nhid)).zero_()), # h state\n                    (weight.new(bsz, self.nhid if l != self.nlayers - 1 else (self.ninp if self.tie_weights else self.nhid)).zero_()), # c state\n                    (weight.new(bsz, self.rnns[l].hsize, self.rnns[l].hsize).zero_()) # hebbian trace of recurrent weights\n                    #(weight.new(bsz, self.rnns[l].isize, self.rnns[l].hsize).zero_())  # hebbian trace for the input weights (not necessarily used)\n                    #(weight.new(bsz, self.rnns[l].w.shape[0], self.rnns[l].w.shape[1]).zero_()), # hebbian trace for the recurrent weights\n                    #(weight.new(bsz, self.rnns[l].win.shape[0], self.rnns[l].win.shape[1]).zero_())  # hebbian trace for the input weights (not necessarily used)\n                    )\n                    for l in range(self.nlayers)]\n        elif self.rnn_type == 'SPLITLSTM':\n            return [(\n                    (weight.new(bsz, self.nhid if l != self.nlayers - 1 else (self.ninp if self.tie_weights else self.nhid)).zero_()),   # H state\n                    (weight.new(bsz, self.rnns[l].lsize ).zero_()),   # C state\n                    (weight.new(bsz, self.rnns[l].w.shape[0], 
self.rnns[l].w.shape[1]).zero_()),   # hebb\n                    (weight.new(bsz, self.rnns[l].win.shape[0], self.rnns[l].win.shape[1]).zero_())  # hebbin\n                    )\n                    for l in range(self.nlayers)]\n        elif self.rnn_type == 'LSTM' :\n            return [((weight.new(1, bsz, self.nhid if l != self.nlayers - 1 else (self.ninp if self.tie_weights else self.nhid)).zero_()),\n                    (weight.new(1, bsz, self.nhid if l != self.nlayers - 1 else (self.ninp if self.tie_weights else self.nhid)).zero_()))\n                    for l in range(self.nlayers)]\n        elif self.rnn_type == 'QRNN' or self.rnn_type == 'GRU':\n            return [(weight.new(1, bsz, self.nhid if l != self.nlayers - 1 else (self.ninp if self.tie_weights else self.nhid)).zero_())\n                    for l in range(self.nlayers)]\n"
  },
  {
    "path": "awd-lstm-lm/model.py.old",
    "content": "import torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\n\nfrom embed_regularize import embedded_dropout\nfrom locked_dropout import LockedDropout\nfrom weight_drop import WeightDrop\n\nclass RNNModel(nn.Module):\n    \"\"\"Container module with an encoder, a recurrent module, and a decoder.\"\"\"\n\n    def __init__(self, rnn_type, ntoken, ninp, nhid, nlayers, dropout=0.5, dropouth=0.5, dropouti=0.5, dropoute=0.1, wdrop=0, tie_weights=False):\n        super(RNNModel, self).__init__()\n        self.lockdrop = LockedDropout()\n        self.idrop = nn.Dropout(dropouti)\n        self.hdrop = nn.Dropout(dropouth)\n        self.drop = nn.Dropout(dropout)\n        self.encoder = nn.Embedding(ntoken, ninp)\n        assert rnn_type in ['LSTM', 'QRNN', 'GRU'], 'RNN type is not supported'\n        if rnn_type == 'LSTM':\n            self.rnns = [torch.nn.LSTM(ninp if l == 0 else nhid, nhid if l != nlayers - 1 else (ninp if tie_weights else nhid), 1, dropout=0) for l in range(nlayers)]\n            if wdrop:\n                self.rnns = [WeightDrop(rnn, ['weight_hh_l0'], dropout=wdrop) for rnn in self.rnns]\n        if rnn_type == 'GRU':\n            self.rnns = [torch.nn.GRU(ninp if l == 0 else nhid, nhid if l != nlayers - 1 else ninp, 1, dropout=0) for l in range(nlayers)]\n            if wdrop:\n                self.rnns = [WeightDrop(rnn, ['weight_hh_l0'], dropout=wdrop) for rnn in self.rnns]\n        elif rnn_type == 'QRNN':\n            from torchqrnn import QRNNLayer\n            self.rnns = [QRNNLayer(input_size=ninp if l == 0 else nhid, hidden_size=nhid if l != nlayers - 1 else (ninp if tie_weights else nhid), save_prev_x=True, zoneout=0, window=2 if l == 0 else 1, output_gate=True) for l in range(nlayers)]\n            for rnn in self.rnns:\n                rnn.linear = WeightDrop(rnn.linear, ['weight'], dropout=wdrop)\n        print(self.rnns)\n        self.rnns = torch.nn.ModuleList(self.rnns)\n        self.decoder = 
nn.Linear(nhid, ntoken)\n\n        # Optionally tie weights as in:\n        # \"Using the Output Embedding to Improve Language Models\" (Press & Wolf 2016)\n        # https://arxiv.org/abs/1608.05859\n        # and\n        # \"Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling\" (Inan et al. 2016)\n        # https://arxiv.org/abs/1611.01462\n        if tie_weights:\n            #if nhid != ninp:\n            #    raise ValueError('When using the tied flag, nhid must be equal to emsize')\n            self.decoder.weight = self.encoder.weight\n\n        self.init_weights()\n\n        self.rnn_type = rnn_type\n        self.ninp = ninp\n        self.nhid = nhid\n        self.nlayers = nlayers\n        self.dropout = dropout\n        self.dropouti = dropouti\n        self.dropouth = dropouth\n        self.dropoute = dropoute\n        self.tie_weights = tie_weights\n\n    def reset(self):\n        if self.rnn_type == 'QRNN': [r.reset() for r in self.rnns]\n\n    def init_weights(self):\n        initrange = 0.1\n        self.encoder.weight.data.uniform_(-initrange, initrange)\n        self.decoder.bias.data.fill_(0)\n        self.decoder.weight.data.uniform_(-initrange, initrange)\n\n    def forward(self, input, hidden, return_h=False):\n        emb = embedded_dropout(self.encoder, input, dropout=self.dropoute if self.training else 0)\n        #emb = self.idrop(emb)\n\n        emb = self.lockdrop(emb, self.dropouti)\n\n        raw_output = emb\n        new_hidden = []\n        #raw_output, hidden = self.rnn(emb, hidden)\n        raw_outputs = []\n        outputs = []\n        for l, rnn in enumerate(self.rnns):\n            current_input = raw_output\n            raw_output, new_h = rnn(raw_output, hidden[l])\n            new_hidden.append(new_h)\n            raw_outputs.append(raw_output)\n            if l != self.nlayers - 1:\n                #self.hdrop(raw_output)\n                raw_output = self.lockdrop(raw_output, self.dropouth)\n    
            outputs.append(raw_output)\n        hidden = new_hidden\n\n        output = self.lockdrop(raw_output, self.dropout)\n        outputs.append(output)\n\n        result = output.view(output.size(0)*output.size(1), output.size(2))\n        if return_h:\n            return result, hidden, raw_outputs, outputs\n        return result, hidden\n\n    def init_hidden(self, bsz):\n        weight = next(self.parameters()).data\n        if self.rnn_type == 'LSTM':\n            return [(Variable(weight.new(1, bsz, self.nhid if l != self.nlayers - 1 else (self.ninp if self.tie_weights else self.nhid)).zero_()),\n                    Variable(weight.new(1, bsz, self.nhid if l != self.nlayers - 1 else (self.ninp if self.tie_weights else self.nhid)).zero_()))\n                    for l in range(self.nlayers)]\n        elif self.rnn_type == 'QRNN' or self.rnn_type == 'GRU':\n            return [Variable(weight.new(1, bsz, self.nhid if l != self.nlayers - 1 else (self.ninp if self.tie_weights else self.nhid)).zero_())\n                    for l in range(self.nlayers)]\n"
  },
  {
    "path": "awd-lstm-lm/mylstm.py",
    "content": "# Plastic LSTMs, with neuromodulation (backpropamine), \n# as described in Miconi et al. ICLR 2019,\n# by Thomas Miconi and Aditya Rawal.\n# Copyright (c) 2018-2019 Uber Technologies, Inc.\n#\n# Licensed under the Uber Non-Commercial License (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at the root directory of this project. \n\n\n\nimport torch\nfrom torch import nn\nfrom torch.autograd import Variable\nimport torch.nn.functional as F\nimport numpy as np\n\n\nimport pdb\n\n\n\n# SimplePlasticLSTM is a full-fledged implementation of Plastic LSTMs that uses\n# default settings and is not parametrizable beyond input size and hidden size.\n# This allows for simpler code and easier understanding. See \"PlasticLSTM\"\n# below for a more customizable version.\n\nclass SimplePlasticLSTM(nn.Module):             \n    def __init__(self, isize, hsize, params):   # Note that 'params' is ignored for this class; we keep it to preserve the constructor's signature\n        super(SimplePlasticLSTM, self).__init__()\n        self.softmax= torch.nn.functional.softmax\n        self.activ = F.tanh\n\n        # Plastic connection trainable parameters, i.e. 
w and alpha:\n        self.w =  torch.nn.Parameter(.02 * torch.rand(hsize, hsize) - .01)\n        self.alpha = torch.nn.Parameter(.0001 * torch.rand(1,1,hsize))      # One alpha per neuron (all incoming connections to a neuron share same alpha)\n        #self.alpha = torch.nn.Parameter(.0001 * torch.ones(1))             # One alpha for the whole network\n        #self.alpha =  torch.nn.Parameter(.0001 * torch.rand(hsize, hsize)) # One alpha per connection\n        \n        self.h2f = torch.nn.Linear(hsize, hsize)\n        self.h2i = torch.nn.Linear(hsize, hsize)\n        self.h2opt = torch.nn.Linear(hsize, hsize)\n        #self.h2c = torch.nn.Linear(hsize, hsize)  # This (equivalent to Whg in PyTorch LSTM docs / Uc in Wikipedia description of LSTM) is replaced by the plastic connection\n        self.x2f = torch.nn.Linear(isize, hsize)\n        self.x2opt = torch.nn.Linear(isize, hsize)\n        self.x2i = torch.nn.Linear(isize, hsize)\n        self.x2c = torch.nn.Linear(isize, hsize)  \n       \n        # Modulator output (M(t))\n        self.h2mod = torch.nn.Linear(hsize, 1)      # Takes input from the h-state, computes the neuromodulator output\n        self.modfanout = torch.nn.Linear(1, hsize)  # Projects the network's common neuromodulator output onto each neuron\n        \n        self.isize = isize\n        self.hsize = hsize\n\n\n    def forward(self, inputs, hidden): #, hebb, et, pw):  # hidden is a tuple of h, c and hebb\n        hebb = hidden[2]\n        fgt = F.sigmoid(self.x2f(inputs) + self.h2f(hidden[0]))\n        ipt = F.sigmoid(self.x2i(inputs) + self.h2i(hidden[0]))\n        opt = F.sigmoid(self.x2opt(inputs) + self.h2opt(hidden[0]))\n        \n        # To implement plasticity, we replace h2c / Whg / Uc with a plastic connection composed of w, alpha and hebb\n        # Note that h2c / Whg / Uc is the matrix of weights that takes in the\n        # previous time-step h, and whose output (after adding the current input \n        # and passing 
through tanh) is multiplied by the input gates before being \n        # added to the cell state\n        # Note: Each *column* in w, hebb and alpha constitutes the inputs to a single cell\n        # For w and alpha, columns are 2nd dimension (i.e. dim 1); for hebb, it's dimension 2 (dimension 0 is batch)\n        \n        # This is probably not the most elegant way to do it, but it works (remember that there is one alpha per neuron, applied to all input connections of this neuron)\n        h2coutput = hidden[0].unsqueeze(1).bmm(self.w + torch.mul(self.alpha, hebb)).squeeze(1)  \n\n        x2coutput = self.x2c(inputs)\n        inputstocell =  F.tanh(self.x2c(inputs) + h2coutput)  #  We compute this intermediary state to be used in Hebbian computations below\n        \n        # Finally, compute the new cell and hidden states\n        cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, inputstocell) \n        hactiv = torch.mul(opt, F.tanh(cell))\n        \n        # Now we need to update the Hebbian traces, including any neuromodulation.\n\n        deltahebb = torch.bmm(hidden[0].unsqueeze(2), inputstocell.unsqueeze(1))\n        myeta = F.tanh(self.h2mod(hactiv)).unsqueeze(2)  # Shape: BatchSize x 1 x 1\n        \n        # The output of the following line has shape BatchSize x 1 x NHidden, i.e. 1 line and NHidden columns for each \n        # batch element. \n        # When multiplying by deltahebb (BatchSize x NHidden x NHidden), broadcasting will provide a different\n        # value for each column but the same value for all rows within each column. This is equivalent to providing\n        # the same neuromodulation to all the inputs to a given cell, while letting neuromodulation differ from \n        # cell to cell, as required for the fanout concept.\n        \n        myeta = self.modfanout(myeta).squeeze().unsqueeze(1)              \n\n        hebb = torch.clamp(hebb + myeta * deltahebb, min=-2.0, max=2.0)\n\n        # Note that \"hactiv\" (i.e. 
the new h-state) is duplicated in the return\n        # values. This is to maintain the signature used by main.py/model.py (which is from Merity et al.'s code)\n        # and is not necessary for other applications.\n\n        hidden = (hactiv, cell, hebb) \n        activout = hactiv \n\n        return activout, hidden \n\n\n\n# A more customizable version of plastic LSTMs, using parameters passed in the 'params' argument.\n\nclass PlasticLSTM(nn.Module):\n    def __init__(self, isize, hsize, params):\n        super(PlasticLSTM, self).__init__()\n        self.softmax= torch.nn.functional.softmax\n        #if params['activ'] == 'tanh':\n        self.activ = F.tanh\n    \n        # Default values for configuration parameters:\n        self.cliptype, self.modultype, self.hebboutput, self.modulout, self.clipval, self.alphatype = 'clip', 'modplasth2mod', 'i2c', 'fanout', 2.0, 'perneuron'\n\n        # Description of the parameters:\n\n        # alphatype: do we have one alpha coefficient for each connection\n        # ('full'), one per neuron ('perneuron' - i.e. all input connections to\n        # a given neuron share the same alpha), or one for the entire network\n        # ('single')?\n\n        # modultype: 'none' (non-modulated plasticity), 'modplasth2mod'\n        # (neuromodulation takes input from the current h-state) or\n        # 'modplastc2mod' (neuromodulation takes input from the current\n        # c-state).\n\n        # cliptype: 'clip', 'aditya' or 'decay' - specifies how the Hebbian traces should be constrained.\n\n        # clipval: maximum magnitude of the Hebbian trace values (default 2.0)\n\n        # modulout: 'single' (all connections receive the same neuromodulator\n        # output) or 'fanout' (neuromodulator input goes through a 1xN linear layer to reach each neuron)\n\n        # hebboutput: what counts as the \"output\" in the Hebbian product of input by output. 
Better to leave it at 'i2c'.\n\n        if 'cliptype' in params:\n            self.cliptype = params['cliptype']\n        if 'modultype' in params:\n            self.modultype = params['modultype']\n        if 'hebboutput' in params:\n            self.hebboutput = params['hebboutput']\n        if 'modulout' in params:\n            self.modulout = params['modulout']\n        if 'clipval' in params:\n            self.clipval = params['clipval']\n        if 'alphatype' in params:\n            self.alphatype = params['alphatype']\n\n        # Plastic connection trainable parameters, i.e. w and alpha:\n        self.w =  torch.nn.Parameter(.02 * torch.rand(hsize, hsize) - .01)\n        if self.alphatype == 'perneuron':\n            self.alpha = torch.nn.Parameter(.0001 * torch.rand(1,1,hsize))\n        elif self.alphatype == 'single':\n            self.alpha = torch.nn.Parameter(.0001 * torch.ones(1))\n        elif self.alphatype == 'full':\n            self.alpha =  torch.nn.Parameter(.0001 * torch.rand(hsize, hsize))\n        else:\n            raise ValueError(\"Must select appropriate alpha type (current incorrect value is: \" + str(self.alphatype) + \")\")\n        if self.modultype == 'none':\n            self.eta = torch.nn.Parameter(.01 * torch.ones(1))  # Everyone has the same eta (Note: if a parameter is not actually used, there can be problems with ASGD handling in main.py) \n        \n        self.h2f = torch.nn.Linear(hsize, hsize)\n        self.h2i = torch.nn.Linear(hsize, hsize)\n        self.h2opt = torch.nn.Linear(hsize, hsize)\n        #self.h2c = torch.nn.Linear(hsize, hsize)  # This (equivalent to Whg in PyTorch LSTM docs / Uc in Wikipedia description of LSTM) is replaced by the plastic connection\n        self.x2f = torch.nn.Linear(isize, hsize)\n        self.x2opt = torch.nn.Linear(isize, hsize)\n        self.x2i = torch.nn.Linear(isize, hsize)\n        self.x2c = torch.nn.Linear(isize, hsize)  \n       \n        if self.modultype != 'none':\n            
# This is the layer that computes the neuromodulator output at any time step, based on current hidden state.\n            # Although called 'h2mod', it may take input from h or c depending on modultype value\n            self.h2mod = torch.nn.Linear(hsize, 1)  \n            # Is the modulation just a single scalar, or do we pass it through a 'fanout' weight matrix to get one different value for each target neuron?\n            if self.modulout == 'fanout':\n                self.modfanout = torch.nn.Linear(1, hsize)  \n        \n        self.isize = isize\n        self.hsize = hsize\n\n\n    def forward(self, inputs, hidden): #, hebb, et, pw):  # hidden is a tuple of h, c and hebb\n        hebb = hidden[2]\n        fgt = F.sigmoid(self.x2f(inputs) + self.h2f(hidden[0]))\n        ipt = F.sigmoid(self.x2i(inputs) + self.h2i(hidden[0]))\n        opt = F.sigmoid(self.x2opt(inputs) + self.h2opt(hidden[0]))\n        \n        # To implement plasticity, we replace h2c / Whg / Uc with a plastic connection composed of w, alpha and hebb\n        # Note that h2c / Whg / Uc is the matrix of weights that takes in the\n        # previous time-step h, and whose output (after adding the current input \n        # and passing through tanh) is multiplied by the input gates before being \n        # added to the cell state\n        # Note: Each *column* in w, hebb and alpha constitutes the inputs to a single cell\n        # For w and alpha, columns are 2nd dimension (i.e. 
dim 1); for hebb, it's dimension 2 (dimension 0 is batch)\n        if self.cliptype == 'aditya':   # Clipping Hebbian traces a posteriori\n            h2coutput = hidden[0].unsqueeze(1).bmm(self.w + torch.mul(self.alpha, torch.clamp(hebb, min=-self.clipval, max=self.clipval))).squeeze(1)  \n        else:\n            h2coutput = hidden[0].unsqueeze(1).bmm(self.w + torch.mul(self.alpha, hebb)).squeeze(1)  \n\n        x2coutput = self.x2c(inputs)\n        inputstocell =  F.tanh(self.x2c(inputs) + h2coutput)  #  We compute this intermediary state to be used in Hebbian computations below\n        \n        # Finally, compute the new cell and hidden states\n        cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, inputstocell) \n        hactiv = torch.mul(opt, F.tanh(cell))\n        \n        # Now we need to compute the updates to the Hebbian traces, including any neuromodulation.\n\n        # For the Hebbian computation, what counts as \"output\"?\n        if self.hebboutput == 'i2c':\n            deltahebb = torch.bmm(hidden[0].unsqueeze(2), inputstocell.unsqueeze(1))\n        elif self.hebboutput == 'h2co': \n            deltahebb = torch.bmm(hidden[0].unsqueeze(2), h2coutput.unsqueeze(1))\n        elif self.hebboutput == 'cell': \n            deltahebb = torch.bmm(hidden[0].unsqueeze(2), cell.unsqueeze(1))\n        elif self.hebboutput == 'hidden': \n            deltahebb = torch.bmm(hidden[0].unsqueeze(2), hactiv.unsqueeze(1)) \n        else: \n            raise ValueError(\"Must choose Hebbian target output\")\n\n        # What is the source of the neuromodulator computation (if any)?\n        if self.modultype == 'none':\n            myeta = self.eta\n        elif self.modultype == 'modplasth2mod': # The neuromodulation takes input from the h-state\n            myeta = F.tanh(self.h2mod(hactiv)).unsqueeze(2)  # Shape: BatchSize x 1 x 1\n        elif self.modultype == 'modplastc2mod': # The neuromodulation takes input from the c-state\n            myeta = 
F.tanh(self.h2mod(cell)).unsqueeze(2)\n        else: \n            raise ValueError(\"Must choose modulation type\")\n        \n\n        # If we use \"fanout\" neuromodulation, the neuromodulator output is passed through a (trainable) linear layer before hitting the neurons. \n        if self.modultype != 'none' and self.modulout == 'fanout':\n            # The output of the following line has shape BatchSize x 1 x NHidden, i.e. 1 line and NHidden columns for each \n            # batch element. \n            # When multiplying by deltahebb (BatchSize x NHidden x NHidden), broadcasting will provide a different\n            # value for each column but the same value for all rows within each column. This is equivalent to providing\n            # the same neuromodulation to all the inputs to a given cell, while letting neuromodulation differ from \n            # cell to cell, as required for the fanout concept.\n            \n            myeta = self.modfanout(myeta).squeeze().unsqueeze(1)              \n\n        # Various possible ways to clip the Hebbian trace \n        if self.cliptype == 'decay':    # Exponential decay\n            hebb = (1 - myeta) * hebb + myeta * deltahebb\n        elif self.cliptype == 'clip':   # Just a hard clip\n            hebb = torch.clamp(hebb + myeta * deltahebb, min=-self.clipval, max=self.clipval)\n        elif self.cliptype == 'aditya': # For this one, the clipping only occurs a posteriori (see above); hebb itself can grow arbitrarily\n            hebb = hebb + myeta * deltahebb   \n        else: \n            raise ValueError(\"Must choose clip type\")\n\n\n        # Note that \"hactiv\" (i.e. the new h-state) is duplicated in the return\n        # values. 
This is to maintain the signature used by main.py/model.py\n        # and is not necessary for other applications.\n\n        hidden = (hactiv, cell, hebb) \n        activout = hactiv \n\n        return activout, hidden \n\n\n\n# This is a slightly faster implementation of Plastic LSTMs: it cuts time by ~30% by grouping all matrix multiplications into two. Not fully debugged, use at your own risk.\nclass MyFastPlasticLSTM(nn.Module):\n    def __init__(self, isize, hsize, params):\n        super(MyFastPlasticLSTM, self).__init__()\n        self.softmax= torch.nn.functional.softmax\n        self.activ = F.tanh\n\n        ok=0\n        if 'cliptype' in params:\n            self.cliptype = params['cliptype']\n            ok+=1\n        if 'modultype' in params:\n            self.modultype = params['modultype']\n            ok+=1\n        if 'hebboutput' in params:\n            self.hebboutput = params['hebboutput']\n            ok+=1\n        if 'modulout' in params:\n            self.modulout = params['modulout']\n            ok+=1\n        if 'clipval' in params:\n            self.clipval = params['clipval']\n            ok+=1\n        if 'alphatype' in params:\n            self.alphatype = params['alphatype']\n            ok+=1\n        if ok < 6:\n            raise ValueError('When constructing MyFastPlasticLSTM, must pass \"params\" dictionary including cliptype, clipval, modultype, modulout, alphatype and hebboutput')\n\n        # We group all weight matrices into two, just like the C implementation of LSTMs in PyTorch does. 
Faster!\n        # Note: this creates some redundant biases (though not many)\n        self.h2f_i_opt_c = torch.nn.Linear(hsize, 4*hsize) # Weights from h to f, i, o and c\n        self.x2f_i_opt_c = torch.nn.Linear(isize, 4*hsize) # Weights from x to f, i, o and c\n        self.isize = isize\n        self.hsize = hsize\n        \n        if self.modultype != 'none':\n            self.h2mod = torch.nn.Linear(hsize, 1)  # Although called 'h2mod', it may take input from h or c depending on modultype value\n            if self.modulout == 'fanout':\n                self.modfanout = torch.nn.Linear(1, hsize)  \n        \n        if self.alphatype == 'perneuron':\n            self.alpha = torch.nn.Parameter(.0001 * torch.rand(1,1,hsize))\n            #self.alpha = Variable(.0001 * torch.ones(1).cuda(), requires_grad=True) #torch.rand(1,1,hsize))\n        elif self.alphatype == 'single':\n            self.alpha = torch.nn.Parameter(.0001 * torch.ones(1))\n        elif self.alphatype == 'full':\n            self.alpha =  torch.nn.Parameter(.0001 * torch.rand(hsize, hsize))\n        else:\n            raise ValueError(\"Must select alpha type (current incorrect value is: \" + str(self.alphatype) + \")\")\n        if self.modultype == 'none':\n            self.eta = torch.nn.Parameter(.01 * torch.ones(1))  # Everyone has the same eta (Note: if a parameter is not actually used, there can be problems with ASGD handling in main.py) \n\n\n\n    def forward(self, inputs, hidden): #, hebb, et, pw):  # hidden is a tuple of h, c and hebb states\n            \n        hsize = self.hsize\n        #fgt = F.sigmoid(self.x2f(inputs) + self.h2f(hidden[0]))\n        #ipt = F.sigmoid(self.x2i(inputs) + self.h2i(hidden[0])) \n        #opt = F.sigmoid(self.x2opt(inputs) + self.h2opt(hidden[0])) \n        alloutputs = self.x2f_i_opt_c(inputs) + self.h2f_i_opt_c(hidden[0])\n        \n        # hidden[0] and hidden[1] are the h state and the c state; hidden[2] is the hebbian trace\n        hebb = 
hidden[2]\n\n        fgt = F.sigmoid(alloutputs[:,:hsize])\n        ipt = F.sigmoid(alloutputs[:,hsize:2*hsize])\n        opt = F.sigmoid(alloutputs[:,2*hsize:3*hsize])\n        handx2coutput_w = alloutputs[:,3*hsize:]\n        if self.cliptype == 'aditya':\n            h2coutput_hebb = hidden[0].unsqueeze(1).bmm(torch.mul(self.alpha, self.clipval * torch.tanh(hebb))).squeeze(1)  # Slightly different version\n        else:\n            h2coutput_hebb = hidden[0].unsqueeze(1).bmm(torch.mul(self.alpha, hebb)).squeeze(1)  \n        inputtoc = F.tanh(handx2coutput_w + h2coutput_hebb)\n        \n            # Each *column* in w, hebb and alpha constitutes the inputs to a single cell\n            # For w and alpha, columns are 2nd dimension (i.e. dim 1); for hebb, it's dimension 2 (dimension 0 is batch)\n        \n        cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, inputtoc)\n        hactiv = torch.mul(opt, F.tanh(cell))\n\n\n        #if self.hebboutput == 'i2c':\n        deltahebb = torch.bmm(hidden[0].unsqueeze(2), inputtoc.unsqueeze(1))\n        if self.modultype == 'none':\n            myeta = self.eta\n        elif self.modultype == 'modplasth2mod':\n            myeta = F.tanh(self.h2mod(hactiv)).unsqueeze(2)  # Shape: BatchSize x 1 x 1\n        elif self.modultype == 'modplastc2mod':\n            myeta = F.tanh(self.h2mod(cell)).unsqueeze(2)\n        else: \n            raise ValueError(\"Must choose modulation type\")\n        \n        if self.modultype != 'none' and self.modulout == 'fanout':\n            # Each *column* in w, hebb and alpha constitutes the inputs to a single cell\n            # For w and alpha, columns are 2nd dimension (i.e. dim 1); for hebb, it's dimension 2 (dimension 0 is batch)\n            # The output of the following line has shape BatchSize x 1 x NHidden, i.e. 1 line and NHidden columns for each \n            # batch element. 
When multiplying by hebb (BatchSize x NHidden x NHidden), broadcasting will provide a different\n            # value of myeta for each cell but the same value for all inputs of a cell, as required by fanout concept.\n            myeta = self.modfanout(myeta).squeeze().unsqueeze(1)\n\n        if self.cliptype == 'decay':\n            hebb = (1 - myeta) * hebb + myeta * deltahebb\n        elif self.cliptype == 'clip':\n            hebb = torch.clamp(hebb + myeta * deltahebb, min=-self.clipval, max=self.clipval)\n        elif self.cliptype == 'aditya':\n            hebb = hebb + myeta * deltahebb   \n        else: \n            raise ValueError(\"Must choose clip type\")\n\n        hidden = (hactiv, cell, hebb)\n        activout = hactiv \n        \n\n\n        return activout, hidden #, hebb, et, pw\n\n\n\n\n\n\n\n# Standard, non-plastic LSTM, reimplemented \"by hand\" to check if our\n# implementation is correct, and to ensure that our comparisons use the closest\n# possible non-plastic equivalent to our plastic LSTMs. 
Gets almost identical\n# results to the PyTorch internal LSTM used by the original smerity code.\n\nclass MyLSTM(nn.Module):\n    def __init__(self, isize, hsize):\n        super(MyLSTM, self).__init__()\n        self.softmax= torch.nn.functional.softmax\n        #if params['activ'] == 'tanh':\n        self.activ = F.tanh\n        self.h2f = torch.nn.Linear(hsize, hsize)\n        self.h2i = torch.nn.Linear(hsize, hsize)\n        self.h2opt = torch.nn.Linear(hsize, hsize)\n        self.h2c = torch.nn.Linear(hsize, hsize)\n        self.x2f = torch.nn.Linear(isize, hsize)\n        self.x2opt = torch.nn.Linear(isize, hsize)\n        self.x2i = torch.nn.Linear(isize, hsize)\n        self.x2c = torch.nn.Linear(isize, hsize)\n        self.isize = isize\n        self.hsize = hsize\n\n\n\n    def forward(self, inputs, hidden): #, hebb, et, pw):  # hidden is a tuple of h and c states\n            \n        fgt = F.sigmoid(self.x2f(inputs) + self.h2f(hidden[0]))\n        ipt = F.sigmoid(self.x2i(inputs) + self.h2i(hidden[0]))\n        opt = F.sigmoid(self.x2opt(inputs) + self.h2opt(hidden[0]))\n        cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, F.tanh(self.x2c(inputs) + self.h2c(hidden[0])))\n        hactiv = torch.mul(opt, F.tanh(cell))\n        #pdb.set_trace()\n        hidden = (hactiv, cell)\n        activout = hactiv #self.h2o(hactiv)\n        #pdb.set_trace()\n        return activout, hidden #, hebb, et, pw\n\n\n\n\n# Faster MyLSTM - by ~30% in comparison to MyLSTM, by grouping matrices and matrix multiplications. 
Not fully debugged, use at own risk.\nclass MyFastLSTM(nn.Module):\n    def __init__(self, isize, hsize):\n        super(MyFastLSTM, self).__init__()\n        self.softmax= torch.nn.functional.softmax\n        #if params['activ'] == 'tanh':\n        self.activ = F.tanh\n        # We group all weight matrices into two, just like the C implementation of LSTMs in PyTorch does\n        # Note: this creates some redundant biases (though not many)\n        self.h2f_i_opt_c = torch.nn.Linear(hsize, 4*hsize) # Weights from h to f, i, o and c\n        self.x2f_i_opt_c = torch.nn.Linear(isize, 4*hsize) # Weights from x to f, i, o and c\n        self.isize = isize\n        self.hsize = hsize\n\n\n\n    def forward(self, inputs, hidden): #, hebb, et, pw):  # hidden is a tuple of h and c states\n            \n        #fgt = F.sigmoid(self.x2f(inputs) + self.h2f(hidden[0])) #\n        #ipt = F.sigmoid(self.x2i(inputs) + self.h2i(hidden[0])) #\n        #opt = F.sigmoid(self.x2opt(inputs) + self.h2opt(hidden[0])) #\n        alloutputs = self.x2f_i_opt_c(inputs) + self.h2f_i_opt_c(hidden[0])\n        \n        hsize = self.hsize\n        # You can gain ~ 5% in speed by grouping these three :\n        fgt = F.sigmoid(alloutputs[:,:hsize])\n        ipt = F.sigmoid(alloutputs[:,hsize:2*hsize])\n        opt = F.sigmoid(alloutputs[:,2*hsize:3*hsize])\n        inputtoc = F.tanh(alloutputs[:,3*hsize:])\n        #cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, F.tanh(self.x2c(inputs) + self.h2c(hidden[0])))#\n        cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, inputtoc)\n        hactiv = torch.mul(opt, F.tanh(cell))\n        hidden = (hactiv, cell)\n        activout = hactiv \n        #pdb.set_trace()\n        return activout, hidden #, hebb, et, pw\n\n\n"
  },
  {
    "path": "awd-lstm-lm/mylstm.py.orig",
    "content": "import torch\nfrom torch import nn\nfrom torch.autograd import Variable\nimport torch.nn.functional as F\nimport numpy as np\n\n\nimport pdb\n\n\nclass PlasticLSTM(nn.Module):\n    def __init__(self, isize, hsize, params):\n        super(PlasticLSTM, self).__init__()\n        self.softmax= torch.nn.functional.softmax\n        #if params['activ'] == 'tanh':\n        self.activ = F.tanh\n\n        ok=0\n        if 'cliptype' in params:\n            self.cliptype = params['cliptype']\n            ok+=1\n        if 'modultype' in params:\n            self.modultype = params['modultype']\n            ok+=1\n        if 'hebboutput' in params:\n            self.hebboutput = params['hebboutput']\n            ok+=1\n        if 'modulout' in params:\n            self.modulout= params['modulout']\n            ok+=1\n        if 'alphatype' in params:\n            self.alphatype= params['alphatype']\n            ok+=1\n        if ok < 5:\n            raise ValueError('When using PlasticLSTM, must specify cliptype, modultype, modulout, alphatype and hebboutput in params')\n\n        # Plastic connection parameters:\n        self.w =  torch.nn.Parameter(.02 * torch.rand(hsize, hsize) - .01)\n        if self.alphatype == 'perneuron':\n            self.alpha = torch.nn.Parameter(.0001 * torch.rand(1,1,hsize))\n            #self.alpha = Variable(.0001 * torch.ones(1).cuda(), requires_grad=True) #torch.rand(1,1,hsize))\n        elif self.alphatype == 'full':\n            self.alpha =  torch.nn.Parameter(.0001 * torch.rand(hsize, hsize))\n        else:\n            raise ValueError(\"Must select alpha type (current incorrect value is:\", str(self.alphatype), \")\")\n        if self.modultype == 'none':\n            self.eta = torch.nn.Parameter(.01 * torch.ones(1))  # Everyone has the same eta (Note: if a parameter is not actually used, there can be problems with ASGD handling in main.py) \n        #self.eta = .01\n        \n        self.h2f = torch.nn.Linear(hsize, 
hsize)\n        self.h2i = torch.nn.Linear(hsize, hsize)\n        self.h2opt = torch.nn.Linear(hsize, hsize)\n        #self.h2c = torch.nn.Linear(hsize, hsize)  # This (equivalent to Whg in the PyTorch docs, Uc in Wikipedia) is replaced by the plastic connection\n        self.x2f = torch.nn.Linear(isize, hsize)\n        self.x2opt = torch.nn.Linear(isize, hsize)\n        self.x2i = torch.nn.Linear(isize, hsize)\n        self.x2c = torch.nn.Linear(isize, hsize)\n       \n        # Is the modulation just a single scalar, or do we pass it through a 'fanout' weight matrix?\n        if self.modultype != 'none':\n            self.h2mod = torch.nn.Linear(hsize, 1)  # Although called 'h2mod', it may take input from h or c depending on modultype value\n            if self.modulout == 'fanout':\n                self.modfanout = torch.nn.Linear(1, hsize)  \n        \n        self.isize = isize\n        self.hsize = hsize\n\n\n    def forward(self, inputs, hidden): #, hebb, et, pw):  # hidden is a tuple of h, c and hebb\n        \n        hebb = hidden[2]\n        fgt = F.sigmoid(self.x2f(inputs) + self.h2f(hidden[0]))\n        ipt = F.sigmoid(self.x2i(inputs) + self.h2i(hidden[0]))\n        opt = F.sigmoid(self.x2opt(inputs) + self.h2opt(hidden[0]))\n        #cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, F.tanh(self.x2c(inputs) + self.h2c(hidden[0])))\n        \n        # To implement plasticity, we replace h2c / Whg / Uc with a plastic connection composed of w, alpha and hebb\n        # Note that h2c / Whg / Uc is the matrix of weights that takes in the\n        # previous time-step h, and whose output (after adding the current input \n        # and passing through tanh) is multiplied by the input gates before being \n        # added to the cell state\n        if self.cliptype == 'aditya':\n            # Each *column* in w, hebb and alpha constitutes the inputs to a single cell\n            # For w and alpha, columns are 2nd dimension (i.e. 
dim 1); for hebb, it's dimension 2 (dimension 0 is batch)\n            h2coutput = hidden[0].unsqueeze(1).bmm(self.w + torch.mul(self.alpha, torch.clamp(hebb, min=-1.0, max=1.0))).squeeze()  \n        else:\n            h2coutput = hidden[0].unsqueeze(1).bmm(self.w + torch.mul(self.alpha, hebb)).squeeze()  \n            #if np.random.rand() < .1:\n            #    pdb.set_trace()\n        inputstocell =  F.tanh(self.x2c(inputs) + h2coutput)\n        #inputstocell =  F.tanh(self.x2c(inputs) + torch.matmul(hidden[0].unsqueeze(1), self.w.unsqueeze(0)).squeeze(1)) \n        cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, inputstocell) #  self.h2c(hidden[0])))\n\n        \n        #pdb.set_trace()\n        \n        hactiv = torch.mul(opt, F.tanh(cell))\n        #pdb.set_trace()\n        \n        # For the Hebbian computation, what counts as \"output\"?\n        if self.hebboutput == 'i2c':\n            deltahebb = torch.bmm(hidden[0].unsqueeze(2), inputstocell.unsqueeze(1))\n        elif self.hebboutput == 'h2co': \n            deltahebb = torch.bmm(hidden[0].unsqueeze(2), h2coutput.unsqueeze(1))\n        elif self.hebboutput == 'cell': \n            deltahebb = torch.bmm(hidden[0].unsqueeze(2), cell.unsqueeze(1))\n        elif self.hebboutput == 'hidden': \n            deltahebb = torch.bmm(hidden[0].unsqueeze(2), hactiv.unsqueeze(1)) \n        else: \n            raise ValueError(\"Must choose Hebbian target output\")\n\n        # What is the source of the neuromodulator computation (if any)?\n        if self.modultype == 'none':\n            myeta = self.eta\n        elif self.modultype == 'modplasth2mod':\n            myeta = F.tanh(self.h2mod(hactiv)).unsqueeze(2)  # Shape: BatchSize x 1 x 1\n        elif self.modultype == 'modplastc2mod':\n            myeta = F.tanh(self.h2mod(cell)).unsqueeze(2)\n        else: \n            raise ValueError(\"Must choose modulation type\")\n        \n        #pdb.set_trace()\n        if self.modultype != 'none' and 
self.modulout == 'fanout':\n            # Each *column* in w, hebb and alpha constitutes the inputs to a single cell\n            # For w and alpha, columns are 2nd dimension (i.e. dim 1); for hebb, it's dimension 2 (dimension 0 is batch)\n            # The output of the following line has shape BatchSize x 1 x NHidden, i.e. 1 line and NHidden columns for each \n            # batch element. When multiplying by hebb (BatchSize x NHidden x NHidden), broadcasting will provide a different\n            # value for each cell but the same value for all inputs of a cell, as required by fanout concept.\n             myeta = self.modfanout(myeta).squeeze().unsqueeze(1)              \n\n        if self.cliptype == 'decay':\n            hebb = (1 - myeta) * hebb + myeta * deltahebb\n        elif self.cliptype == 'clip':\n            hebb = torch.clamp(hebb + myeta * deltahebb, min=-1.0, max=1.0)\n        elif self.cliptype == 'aditya':\n            hebb = hebb + myeta * deltahebb   \n        else: \n            raise ValueError(\"Must choose clip type\")\n\n        hidden = (hactiv, cell, hebb)\n        activout = hactiv #self.h2o(hactiv)\n        #if np.isnan(np.sum(hactiv.data.cpu().numpy())) or np.isnan(np.sum(hidden[1].data.cpu().numpy())) :\n        #    raise ValueError(\"Nan detected !\")\n        #pdb.set_trace()\n\n        return activout, hidden #, hebb, et, pw\n\n\n\n\nclass MyLSTM(nn.Module):\n# Standard, non-plastic LSTM, reimplemented \"by hand\" to check if our\n# implementation is correct. 
Gets almost identical results to the PyTorch\n# internal LSTM used by the original smerity code.\n    def __init__(self, isize, hsize):\n        super(MyLSTM, self).__init__()\n        self.softmax= torch.nn.functional.softmax\n        #if params['activ'] == 'tanh':\n        self.activ = F.tanh\n        self.h2f = torch.nn.Linear(hsize, hsize)\n        self.h2i = torch.nn.Linear(hsize, hsize)\n        self.h2opt = torch.nn.Linear(hsize, hsize)\n        self.h2c = torch.nn.Linear(hsize, hsize)\n        self.x2f = torch.nn.Linear(isize, hsize)\n        self.x2opt = torch.nn.Linear(isize, hsize)\n        self.x2i = torch.nn.Linear(isize, hsize)\n        self.x2c = torch.nn.Linear(isize, hsize)\n        self.isize = isize\n        self.hsize = hsize\n\n\n    def forward(self, inputs, hidden): #, hebb, et, pw):  # hidden is a tuple of h and c states\n            \n        fgt = F.sigmoid(self.x2f(inputs) + self.h2f(hidden[0]))\n        ipt = F.sigmoid(self.x2i(inputs) + self.h2i(hidden[0]))\n        opt = F.sigmoid(self.x2opt(inputs) + self.h2opt(hidden[0]))\n        cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, F.tanh(self.x2c(inputs) + self.h2c(hidden[0])))\n        hactiv = torch.mul(opt, F.tanh(cell))\n        #pdb.set_trace()\n        hidden = (hactiv, cell)\n        activout = hactiv #self.h2o(hactiv)\n        #if np.isnan(np.sum(hactiv.data.cpu().numpy())) or np.isnan(np.sum(hidden[1].data.cpu().numpy())) :\n        #    raise ValueError(\"Nan detected !\")\n\n        #pdb.set_trace()\n\n        return activout, hidden #, hebb, et, pw\n\n\n\n"
  },
  {
    "path": "awd-lstm-lm/opus.docker.old",
    "content": "#tmiconi_rl\n#latest\n#.\n\n\n#FROM localhost:5000/opus-deep-learning:master-test-2017_9_7_20_56_10\nFROM opus-deep-learning-py3:master-prod-2019_2_5_4_54_39\n#FROM opus-deep-learning:master--2018_9_20_18_2_31\n\n\n\n\nRUN mkdir /home/work\n\nCOPY ./*.py /home/work/\nCOPY ./*.sh /home/work/\nCOPY ./*.md /home/work/\n\nENV LC_ALL C.UTF-8\nENV  LANG C.UTF-8\n\n"
  },
  {
    "path": "awd-lstm-lm/plotresults.py",
    "content": "import numpy as np\nimport glob\nimport matplotlib.pyplot as plt\nimport scipy\nfrom scipy import stats\n\ncolorz = ['r', 'b', 'g', 'c', 'm', 'y', 'orange', 'k']\n\n\n\n\ngroupnames = glob.glob('./HDFS/ptb/results*seed0.txt')  \n#groupnames = glob.glob('./HDFS/ptbprevious/results*seed0.txt')  \n#groupnames = glob.glob('./HDFS/ptbold/results*.txt')  \n\n\n\n#groupnames = glob.glob('./tmp/loss_*new*eplen_250*rngseed_0.txt')  \n#groupnames = glob.glob('./tmp/loss_*new*.9_*rngseed_0.txt')  \n\n\n\n# If you can only use 7 runs, smooth the losses within each run to obtain more reliable estimates of performance!\n\n\ndef mavg(x, N=20):\n  return x\n  #cumsum = np.cumsum(np.insert(x, 0, 0)) \n  #return (cumsum[N:] - cumsum[:-N]) / N\n\nplt.ion()\n#plt.figure(figsize=(5,4))  # Smaller figure = relative larger fonts\nplt.figure()\n\nallmedianls = []\nalllosses = []\nposcol = 0\nminminlen = 999999\nfor numgroup, groupname in enumerate(groupnames):\n    if \"ults__\" not in groupname:\n        continue\n    g = groupname[:-6]+\"*\"\n    print(\"====\", groupname)\n    fnames = glob.glob(g)\n    fulllosses=[]\n    losses=[]\n    lgts=[]\n    for fn in fnames:\n        if \"COPY\" in fn:\n            continue\n        if False:\n            #if \"seed_3\" in fn:\n            #    continue\n            #if \"seed_7\" in fn:\n            #    continue\n            if \"seed_8\" in fn:\n                continue\n            if \"seed_9\" in fn:\n                continue\n            if \"seed_10\" in fn:\n                continue\n            if \"seed_11\" in fn:\n                continue\n            if \"seed_12\" in fn:\n                continue\n            if \"seed_13\" in fn:\n                continue\n            if \"seed_14\" in fn:\n                continue\n            if \"seed_15\" in fn:\n                continue\n        z = np.loadtxt(fn)\n        \n        #z = mavg(z, 10)  # For each run, we average the losses over K successive episodes\n\n       
 #z = z[::10] # Decimation - speed things up!\n        print(len(z))\n        #if len(z) < 100:\n        #    print(fn, len(z))\n        #    continue\n        #z = z[:90]\n        lgts.append(len(z))\n        fulllosses.append(z)\n    minlen = min(lgts)\n    if minlen < minminlen:\n        minminlen = minlen\n    print(minlen)\n    #if minlen < 1000:\n    #    continue\n    for z in fulllosses:\n        losses.append(z[:minlen])\n\n    losses = np.array(losses)\n    alllosses.append(losses)\n    \n    meanl = np.mean(losses, axis=0)\n    stdl = np.std(losses, axis=0)\n    cil = stdl / np.sqrt(losses.shape[0]) * 1.96  # 95% confidence interval - assuming normality\n    #cil = stdl / np.sqrt(losses.shape[0]) * 2.5  # 95% confidence interval - approximated with the t-distribution for 7 d.f.\n\n    medianl = np.median(losses, axis=0)\n    allmedianls.append(medianl)\n    q1l = np.percentile(losses, 25, axis=0)\n    q3l = np.percentile(losses, 75, axis=0)\n    \n    highl = np.max(losses, axis=0)\n    lowl = np.min(losses, axis=0)\n    #highl = meanl+stdl\n    #lowl = meanl-stdl\n\n    xx = range(len(meanl))\n\n    # xticks and labels\n    #xt = range(0, len(meanl), 1000)\n    xt = range(0, 10001, 2000)\n    xtl = [str(10 * 10 * i) for i in xt]   # Because of decimation above, and only every 10th loss is recorded in the files\n\n    #plt.plot(mavg(meanl, 100), label=g) #, color='blue')\n    #plt.fill_between(xx, lowl, highl,  alpha=.2)\n    #plt.fill_between(xx, q1l, q3l,  alpha=.1)\n    #plt.plot(meanl) #, color='blue')\n    ####plt.plot(mavg(medianl, 100), label=g) #, color='blue')  # mavg changes the number of points !\n    #plt.plot(mavg(q1l, 100), label=g, alpha=.3) #, color='blue')\n    #plt.plot(mavg(q3l, 100), label=g, alpha=.3) #, color='blue')\n    #plt.fill_between(xx, q1l, q3l,  alpha=.2)\n    #plt.plot(medianl, label=g) #, color='blue')\n   \n    AVGSIZE = 1\n    \n    xlen = len(mavg(q1l, AVGSIZE))\n    #mylabel = g[g.find('type'):]\n    mylabel = g\n    
myls = '-'\n    if poscol >= len(colorz):\n        myls = \"--\"\n    plt.plot(mavg(medianl, AVGSIZE), label=mylabel, color=colorz[poscol % len(colorz)], ls=myls)  # mavg changes the number of points !\n    plt.fill_between( range(xlen), mavg(q1l, AVGSIZE), mavg(q3l, AVGSIZE),  alpha=.2, color=colorz[poscol % len(colorz)])\n    \n    #xlen = len(mavg(meanl, AVGSIZE))\n    #plt.plot(mavg(meanl, AVGSIZE), label=g, color=colorz[poscol % len(colorz)])  # mavg changes the number of points !\n    #plt.fill_between( range(xlen), mavg(meanl - cil, AVGSIZE), mavg(meanl + cil, AVGSIZE),  alpha=.2, color=colorz[poscol % len(colorz)])\n    \n    poscol += 1\n    \n    #plt.fill_between( range(xlen), mavg(lowl, 100), mavg(highl, 100),  alpha=.2, color=colorz[numgroup % len(colorz)])\n\n    #plt.plot(mavg(losses[0], 1000), label=g, color=colorz[numgroup % len(colorz)])\n    #for curve in losses[1:]:\n    #    plt.plot(mavg(curve, 1000), color=colorz[numgroup % len(colorz)])\n\nps = []\n# Adapt for varying lengths across groups\n#for n in range(0, alllosses[0].shape[1], 3):\n\n#for n in range(0, minminlen):\n#    ps.append(scipy.stats.ranksums(alllosses[0][:,n], alllosses[1][:,n]).pvalue)\n#ps = np.array(ps)\n\nplt.legend(loc='best', fontsize=12)\n#plt.xlabel('Loss (sum square diff. b/w final output and target)')\nplt.xlabel('Number of Episodes')\nplt.ylabel('Loss')\n#plt.xticks(xt, xtl)\n#plt.tight_layout()\n\n\n\n"
  },
  {
    "path": "awd-lstm-lm/plotresultssingle.py",
    "content": "import numpy as np\nimport matplotlib.pyplot as plt\nimport glob\n\n\nfns = glob.glob('./HDFS/ptb/results_*.txt')\n\nplt.figure()\n\nnumcurve = 0\nfor (ii, fn) in enumerate(fns):\n    #if 'B_' not in fn and 'MYLSTM' not in fn:\n    #    continue \n    if 'rngseed' in fn:\n        if 'seed0' not in fn:\n            continue\n    if 'agdiv10'  in fn:\n        continue\n    #if '44' not in fn:\n    #    continue\n    print(fn)\n    #if 'perneuron'  in fn:\n    #    continue\n    numcurve += 1\n    if numcurve > 20:\n        ls = ':'\n    elif numcurve > 10:\n        ls = '--'\n    else:\n        ls = '-'\n    #z = np.loadtxt(fn)\n    z = np.exp(np.loadtxt(fn))\n    plt.plot(z, label=fn,ls=ls)\n\nplt.legend(loc='upper right')\nplt.show()\n\n\n"
  },
  {
    "path": "awd-lstm-lm/pointer.py",
    "content": "import argparse\nimport time\nimport math\nimport numpy as np\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\n\nimport data\nimport model\n\nfrom utils import batchify, get_batch, repackage_hidden\n\nparser = argparse.ArgumentParser(description='PyTorch PennTreeBank RNN/LSTM Language Model')\nparser.add_argument('--data', type=str, default='data/penn',\n                    help='location of the data corpus')\nparser.add_argument('--model', type=str, default='LSTM',\n                    help='type of recurrent net (LSTM, QRNN)')\nparser.add_argument('--save', type=str,default='best.pt',\n                    help='model to use the pointer over')\nparser.add_argument('--cuda', action='store_false',\n                    help='use CUDA')\nparser.add_argument('--bptt', type=int, default=5000,\n                    help='sequence length')\nparser.add_argument('--window', type=int, default=3785,\n                    help='pointer window length')\nparser.add_argument('--theta', type=float, default=0.6625523432485668,\n                    help='mix between uniform distribution and pointer softmax distribution over previous words')\nparser.add_argument('--lambdasm', type=float, default=0.12785920428335693,\n                    help='linear mix between only pointer (1) and only vocab (0) distribution')\nargs = parser.parse_args()\n\n###############################################################################\n# Load data\n###############################################################################\n\ncorpus = data.Corpus(args.data)\n\neval_batch_size = 1\ntest_batch_size = 1\n#train_data = batchify(corpus.train, args.batch_size)\nval_data = batchify(corpus.valid, test_batch_size, args)\ntest_data = batchify(corpus.test, test_batch_size, args)\n\n###############################################################################\n# Build the 
model\n###############################################################################\n\nntokens = len(corpus.dictionary)\ncriterion = nn.CrossEntropyLoss()\n\ndef one_hot(idx, size, cuda=True):\n    a = np.zeros((1, size), np.float32)\n    a[0][idx] = 1\n    v = Variable(torch.from_numpy(a))\n    if cuda: v = v.cuda()\n    return v\n\ndef evaluate(data_source, batch_size=10, window=args.window):\n    # Turn on evaluation mode which disables dropout.\n    if args.model == 'QRNN': model.reset()\n    model.eval()\n    total_loss = 0\n    ntokens = len(corpus.dictionary)\n    hidden = model.init_hidden(batch_size)\n    next_word_history = None\n    pointer_history = None\n    for i in range(0, data_source.size(0) - 1, args.bptt):\n        if i > 0: print(i, len(data_source), math.exp(total_loss / i))\n        data, targets = get_batch(data_source, i, evaluation=True, args=args)\n        output, hidden, rnn_outs, _ = model(data, hidden, return_h=True)\n        rnn_out = rnn_outs[-1].squeeze()\n        output_flat = output.view(-1, ntokens)\n        ###\n        # Fill pointer history\n        start_idx = len(next_word_history) if next_word_history is not None else 0\n        next_word_history = torch.cat([one_hot(t.data[0], ntokens) for t in targets]) if next_word_history is None else torch.cat([next_word_history, torch.cat([one_hot(t.data[0], ntokens) for t in targets])])\n        #print(next_word_history)\n        pointer_history = Variable(rnn_out.data) if pointer_history is None else torch.cat([pointer_history, Variable(rnn_out.data)], dim=0)\n        #print(pointer_history)\n        ###\n        # Built-in cross entropy\n        # total_loss += len(data) * criterion(output_flat, targets).data[0]\n        ###\n        # Manual cross entropy\n        # softmax_output_flat = torch.nn.functional.softmax(output_flat)\n        # soft = torch.gather(softmax_output_flat, dim=1, index=targets.view(-1, 1))\n        # entropy = -torch.log(soft)\n        # total_loss += 
len(data) * entropy.mean().data[0]\n        ###\n        # Pointer manual cross entropy\n        loss = 0\n        softmax_output_flat = torch.nn.functional.softmax(output_flat)\n        for idx, vocab_loss in enumerate(softmax_output_flat):\n            p = vocab_loss\n            if start_idx + idx > window:\n                valid_next_word = next_word_history[start_idx + idx - window:start_idx + idx]\n                valid_pointer_history = pointer_history[start_idx + idx - window:start_idx + idx]\n                logits = torch.mv(valid_pointer_history, rnn_out[idx])\n                theta = args.theta\n                ptr_attn = torch.nn.functional.softmax(theta * logits).view(-1, 1)\n                ptr_dist = (ptr_attn.expand_as(valid_next_word) * valid_next_word).sum(0).squeeze()\n                lambdah = args.lambdasm\n                p = lambdah * ptr_dist + (1 - lambdah) * vocab_loss\n            ###\n            target_loss = p[targets[idx].data]\n            loss += (-torch.log(target_loss)).data[0]\n        total_loss += loss / batch_size\n        ###\n        hidden = repackage_hidden(hidden)\n        next_word_history = next_word_history[-window:]\n        pointer_history = pointer_history[-window:]\n    return total_loss / len(data_source)\n\n# Load the best saved model.\nwith open(args.save, 'rb') as f:\n    if not args.cuda:\n        model = torch.load(f, map_location=lambda storage, loc: storage)\n    else:\n        model = torch.load(f)\nprint(model)\n\n# Run on val data.\nval_loss = evaluate(val_data, test_batch_size)\nprint('=' * 89)\nprint('| End of pointer | val loss {:5.2f} | val ppl {:8.2f}'.format(\n    val_loss, math.exp(val_loss)))\nprint('=' * 89)\n\n# Run on test data.\ntest_loss = evaluate(test_data, test_batch_size)\nprint('=' * 89)\nprint('| End of pointer | test loss {:5.2f} | test ppl {:8.2f}'.format(\n    test_loss, math.exp(test_loss)))\nprint('=' * 89)\n"
  },
  {
    "path": "awd-lstm-lm/request_devbox.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2018_11_27_11_33_25\",\n    \"cpus\":2.0,\n    \"ramMB\":26000,\n    \"gpus\":1,\n    \"diskMB\":8000,\n    \"cluster\":\"opusprodda3e\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":1,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"p40_24gb\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "awd-lstm-lm/request_full.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2019_1_22_14_38_35\",\n    \"name\":\"PLASTICLSTM_bs6_clip2_cliptype_clip_alphatype_full_modultype_modplasth2mod_modulout_fanout_asgdtime_85_1067n_5r\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 bash ./OpusPrepare.sh \\u0026\\u0026 source /.bashrc \\u0026\\u0026 pyenv local 3.5.2 \\u0026\\u0026     python main.py --batch_size 6 --data data/penn --dropouti 0.4 --dropouth 0.25  --epoch 300 --save PTB.pt --wdrop 0 --model PLASTICLSTM --modultype modplasth2mod --modulout fanout  --nhid 1067  --alphatype full --asgdtime 85 --clipval 2.0 --cliptype clip --seed {{mesos.instance}} \",\n    \"ramMB\":25000,\n    \"gpus\":1,\n    \"diskMB\":6000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":5,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"p6000\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "awd-lstm-lm/request_opus.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2019_3_13_17_37_3\",\n    \"name\":\"newcode_SqUsq_clp2_PLASTICLSTM_agdiv1150_opus_alphatype_full_modultype_modplasth2mod_modulout_fanout_asgdtime_125_1068n_5run\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 apt-get install unzip \\u0026\\u0026 sh ./getdata.sh \\u0026\\u0026 python3 main.py --batch_size 6 --data data/penn --dropouti 0.4 --dropouth 0.25  --epoch 500 --save PTB.pt --wdrop 0 --model PLASTICLSTM --modultype modplasth2mod --modulout fanout --nhid 1068  --alphatype full --asgdtime 125 --agdiv 1150 --seed {{mesos.instance}} \",\n    \"ramMB\":25000,\n    \"gpus\":1,\n    \"diskMB\":6000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":5,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"p6000\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "awd-lstm-lm/request_opus.json.old",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2018_12_11_15_39_4\",\n    \"name\":\"PLSTM_plastin_bs3_clip2_opus_alphatype_perneuron_modultype_modplasth2mod_modulout_fanout_asgdtime_65_1149n_5run\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 apt-get install unzip \\u0026\\u0026 sh ./getdata.sh \\u0026\\u0026 python3 main.py --batch_size 3 --data data/penn --dropouti 0.4 --dropouth 0.25  --epoch 300 --save PTB.pt --wdrop 0 --model PLASTICLSTM --modultype modplasth2mod --modulout fanout --nhid 1149  --alphatype perneuron --asgdtime 65 --clipval 2.0 --seed {{mesos.instance}} \",\n    \"ramMB\":25000,\n    \"gpus\":1,\n    \"diskMB\":6000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":5,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"p6000\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "awd-lstm-lm/request_plast.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2018_12_11_15_39_4\",\n    \"name\":\"PLSTM_plastin_bs3_clip2_opus_alphatype_perneuron_modultype_nomodul_modulout_single_asgdtime_44_1149n_5run\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 apt-get install unzip \\u0026\\u0026 sh ./getdata.sh \\u0026\\u0026 python3 main.py --batch_size 3 --data data/penn --dropouti 0.4 --dropouth 0.25  --epoch 300 --save PTB.pt --wdrop 0 --model PLASTICLSTM --modultype none --modulout single --nhid 1149  --alphatype perneuron --asgdtime 44 --clipval 2.0 --seed {{mesos.instance}} \",\n    \"ramMB\":25000,\n    \"gpus\":1,\n    \"diskMB\":6000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":5,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"p6000\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "awd-lstm-lm/splitcross.py",
    "content": "from collections import defaultdict\n\nimport torch\nimport torch.nn as nn\n\nimport numpy as np\n\n\nclass SplitCrossEntropyLoss(nn.Module):\n    r'''SplitCrossEntropyLoss calculates an approximate softmax'''\n    def __init__(self, hidden_size, splits, verbose=False):\n        # We assume splits is [0, split1, split2, N] where N >= |V|\n        # For example, a vocab of 1000 words may have splits [0] + [100, 500] + [inf]\n        super(SplitCrossEntropyLoss, self).__init__()\n        self.hidden_size = hidden_size\n        self.splits = [0] + splits + [100 * 1000000]\n        self.nsplits = len(self.splits) - 1\n        self.stats = defaultdict(list)\n        self.verbose = verbose\n        # Each of the splits that aren't in the head require a pretend token, we'll call them tombstones\n        # The probability given to this tombstone is the probability of selecting an item from the represented split\n        if self.nsplits > 1:\n            self.tail_vectors = nn.Parameter(torch.zeros(self.nsplits - 1, hidden_size))\n            self.tail_bias = nn.Parameter(torch.zeros(self.nsplits - 1))\n\n    def logprob(self, weight, bias, hiddens, splits=None, softmaxed_head_res=None, verbose=False):\n        # First we perform the first softmax on the head vocabulary and the tombstones\n        if softmaxed_head_res is None:\n            start, end = self.splits[0], self.splits[1]\n            head_weight = None if end - start == 0 else weight[start:end]\n            head_bias = None if end - start == 0 else bias[start:end]\n            # We only add the tombstones if we have more than one split\n            if self.nsplits > 1:\n                head_weight = self.tail_vectors if head_weight is None else torch.cat([head_weight, self.tail_vectors])\n                head_bias = self.tail_bias if head_bias is None else torch.cat([head_bias, self.tail_bias])\n\n            # Perform the softmax calculation for the word vectors in the head for all splits\n     
       # We need to guard against empty splits as torch.cat does not like random lists\n            head_res = torch.nn.functional.linear(hiddens, head_weight, bias=head_bias)\n            softmaxed_head_res = torch.nn.functional.log_softmax(head_res, dim=-1)\n\n        if splits is None:\n            splits = list(range(self.nsplits))\n\n        results = []\n        running_offset = 0\n        for idx in splits:\n\n            # For those targets in the head (idx == 0) we only need to return their loss\n            if idx == 0:\n                results.append(softmaxed_head_res[:, :-(self.nsplits - 1)])\n\n            # If the target is in one of the splits, the probability is the p(tombstone) * p(word within tombstone)\n            else:\n                start, end = self.splits[idx], self.splits[idx + 1]\n                tail_weight = weight[start:end]\n                tail_bias = bias[start:end]\n\n                # Calculate the softmax for the words in the tombstone\n                tail_res = torch.nn.functional.linear(hiddens, tail_weight, bias=tail_bias)\n\n                # Then we calculate p(tombstone) * p(word in tombstone)\n                # Adding is equivalent to multiplication in log space\n                head_entropy = (softmaxed_head_res[:, -idx]).contiguous()\n                tail_entropy = torch.nn.functional.log_softmax(tail_res, dim=-1)\n                results.append(head_entropy.view(-1, 1) + tail_entropy)\n\n        if len(results) > 1:\n            return torch.cat(results, dim=1)\n        return results[0]\n\n    def split_on_targets(self, hiddens, targets):\n        # Split the targets into those in the head and in the tail\n        split_targets = []\n        split_hiddens = []\n\n        # Determine to which split each element belongs (for each start split value, add 1 if equal or greater)\n        # This method appears slower at least for WT-103 values for approx softmax\n        #masks = [(targets >= self.splits[idx]).view(1, -1) 
for idx in range(1, self.nsplits)]\n        #mask = torch.sum(torch.cat(masks, dim=0), dim=0)\n        ###\n        # This is equally fast for smaller splits as method below but scales linearly\n        mask = None\n        for idx in range(1, self.nsplits):\n            partial_mask = targets >= self.splits[idx]\n            mask = mask + partial_mask if mask is not None else partial_mask\n        ###\n        #masks = torch.stack([targets] * (self.nsplits - 1))\n        #mask = torch.sum(masks >= self.split_starts, dim=0)\n        for idx in range(self.nsplits):\n            # If there are no splits, avoid costly masked select\n            if self.nsplits == 1:\n                split_targets, split_hiddens = [targets], [hiddens]\n                continue\n            # If all the words are covered by earlier targets, we have empties so later stages don't freak out\n            if sum(len(t) for t in split_targets) == len(targets):\n                split_targets.append([])\n                split_hiddens.append([])\n                continue\n            # Are you in our split?\n            tmp_mask = mask == idx\n            split_targets.append(torch.masked_select(targets, tmp_mask))\n            split_hiddens.append(hiddens.masked_select(tmp_mask.unsqueeze(1).expand_as(hiddens)).view(-1, hiddens.size(1)))\n        return split_targets, split_hiddens\n\n    def forward(self, weight, bias, hiddens, targets, verbose=False):\n        if self.verbose or verbose:\n            for idx in sorted(self.stats):\n                print('{}: {}'.format(idx, int(np.mean(self.stats[idx]))), end=', ')\n            print()\n\n        total_loss = None\n        if len(hiddens.size()) > 2: hiddens = hiddens.view(-1, hiddens.size(2))\n\n        split_targets, split_hiddens = self.split_on_targets(hiddens, targets)\n\n        # First we perform the first softmax on the head vocabulary and the tombstones\n        start, end = self.splits[0], self.splits[1]\n        head_weight = None 
if end - start == 0 else weight[start:end]\n        head_bias = None if end - start == 0 else bias[start:end]\n\n        # We only add the tombstones if we have more than one split\n        if self.nsplits > 1:\n            head_weight = self.tail_vectors if head_weight is None else torch.cat([head_weight, self.tail_vectors])\n            head_bias = self.tail_bias if head_bias is None else torch.cat([head_bias, self.tail_bias])\n\n        # Perform the softmax calculation for the word vectors in the head for all splits\n        # We need to guard against empty splits as torch.cat does not like random lists\n        combo = torch.cat([split_hiddens[i] for i in range(self.nsplits) if len(split_hiddens[i])])\n        ###\n        all_head_res = torch.nn.functional.linear(combo, head_weight, bias=head_bias)\n        softmaxed_all_head_res = torch.nn.functional.log_softmax(all_head_res, dim=-1)\n        if self.verbose or verbose:\n            self.stats[0].append(combo.size()[0] * head_weight.size()[0])\n\n        running_offset = 0\n        for idx in range(self.nsplits):\n            # If there are no targets for this split, continue\n            if len(split_targets[idx]) == 0: continue\n\n            # For those targets in the head (idx == 0) we only need to return their loss\n            if idx == 0:\n                softmaxed_head_res = softmaxed_all_head_res[running_offset:running_offset + len(split_hiddens[idx])]\n                entropy = -torch.gather(softmaxed_head_res, dim=1, index=split_targets[idx].view(-1, 1))\n            # If the target is in one of the splits, the probability is the p(tombstone) * p(word within tombstone)\n            else:\n                softmaxed_head_res = softmaxed_all_head_res[running_offset:running_offset + len(split_hiddens[idx])]\n\n                if self.verbose or verbose:\n                    start, end = self.splits[idx], self.splits[idx + 1]\n                    tail_weight = weight[start:end]\n                    
self.stats[idx].append(split_hiddens[idx].size()[0] * tail_weight.size()[0])\n\n                # Calculate the softmax for the words in the tombstone\n                tail_res = self.logprob(weight, bias, split_hiddens[idx], splits=[idx], softmaxed_head_res=softmaxed_head_res)\n\n                # Then we calculate p(tombstone) * p(word in tombstone)\n                # Adding is equivalent to multiplication in log space\n                head_entropy = softmaxed_head_res[:, -idx]\n                # All indices are shifted - if the first split handles [0,...,499] then the 500th in the second split will be 0 indexed\n                indices = (split_targets[idx] - self.splits[idx]).view(-1, 1)\n                # Warning: if you don't squeeze, you get an N x 1 return, which acts oddly with broadcasting\n                tail_entropy = torch.gather(torch.nn.functional.log_softmax(tail_res, dim=-1), dim=1, index=indices).squeeze()\n                entropy = -(head_entropy + tail_entropy)\n            ###\n            running_offset += len(split_hiddens[idx])\n            total_loss = entropy.float().sum() if total_loss is None else total_loss + entropy.float().sum()\n\n        return (total_loss / len(targets)).type_as(weight)\n\n\nif __name__ == '__main__':\n    np.random.seed(42)\n    torch.manual_seed(42)\n    if torch.cuda.is_available():\n        torch.cuda.manual_seed(42)\n\n    V = 8\n    H = 10\n    N = 100\n    E = 10\n\n    embed = torch.nn.Embedding(V, H)\n    crit = SplitCrossEntropyLoss(hidden_size=H, splits=[V // 2])\n    bias = torch.nn.Parameter(torch.ones(V))\n    optimizer = torch.optim.SGD(list(embed.parameters()) + list(crit.parameters()), lr=1)\n\n    for _ in range(E):\n        prev = torch.autograd.Variable((torch.rand(N, 1) * 0.999 * V).int().long())\n        x = torch.autograd.Variable((torch.rand(N, 1) * 0.999 * V).int().long())\n        y = embed(prev).squeeze()\n        c = crit(embed.weight, bias, y, x.view(N))\n        print('Crit', 
c.exp().data[0])\n\n        logprobs = crit.logprob(embed.weight, bias, y[:2]).exp()\n        print(logprobs)\n        print(logprobs.sum(dim=1))\n\n        optimizer.zero_grad()\n        c.backward()\n        optimizer.step()\n"
  },
  {
    "path": "awd-lstm-lm/test.py",
    "content": "import OpusHdfsCopy\nfrom OpusHdfsCopy import transferFileToHdfsDir, checkHdfs\nimport argparse\nimport time\nimport math\nimport numpy as np\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\n\nimport pdb\n\nimport data\nimport model\n\nfrom utils import batchify, get_batch, repackage_hidden\n\ntorch.nn.Module.dump_patches=True\n\nparser = argparse.ArgumentParser(description='PyTorch PennTreeBank RNN/LSTM Language Model Testing of Saved Models')\nparser.add_argument('--file', type=str, default='',\n                    help='name of the file containing the saved model to be tested')\nparser.add_argument('--data', type=str, default='data/penn/',\n                    help='location of the data corpus')\nparser.add_argument('--model', type=str, default='LSTM',\n                    help='type of recurrent net (LSTM, QRNN, GRU)')\nparser.add_argument('--alphatype', type=str, default='full',\n        help=\"type of alpha matrix: (full, fanout)\")\nparser.add_argument('--modultype', type=str, default='none',\n        help=\"type of modulation: (none, modplasth2mod, modplastc2mod)\")\nparser.add_argument('--modulout', type=str, default='single',\n        help=\"modulatory output (single or fanout)\")\nparser.add_argument('--cliptype', type=str, default='clip',\n                    help=\"clip type (decay, clip, aditya)\")\nparser.add_argument('--hebboutput', type=str, default='i2c',\n                    help='output used for hebbian computations (i2c, h2co, cell, hidden)')\nparser.add_argument('--emsize', type=int, default=400,\n                    help='size of word embeddings')\nparser.add_argument('--nhid', type=int, default=1150,\n                    help='number of hidden units per layer')\nparser.add_argument('--nlayers', type=int, default=3,\n                    help='number of layers')\nparser.add_argument('--lr', type=float, default=30,\n                    help='initial learning rate')\nparser.add_argument('--clip', 
type=float, default=0.25,\n                    help='gradient clipping')\nparser.add_argument('--numgpu', type=int, default=0,\n                    help='which GPU to use? (no effect if GPU not used at all)')\nparser.add_argument('--epochs', type=int, default=8000,\n                    help='upper epoch limit')\nparser.add_argument('--batch_size', type=int, default=80, metavar='N',\n                    help='batch size')\nparser.add_argument('--bptt', type=int, default=70,\n                    help='sequence length')\nparser.add_argument('--dropout', type=float, default=0.4,\n                    help='dropout applied to layers (0 = no dropout)')\nparser.add_argument('--dropouth', type=float, default=0.3,\n                    help='dropout for rnn layers (0 = no dropout)')\nparser.add_argument('--dropouti', type=float, default=0.65,\n                    help='dropout for input embedding layers (0 = no dropout)')\nparser.add_argument('--dropoute', type=float, default=0.1,\n                    help='dropout to remove words from embedding layer (0 = no dropout)')\nparser.add_argument('--wdrop', type=float, default=0.5,\n                    help='amount of weight dropout to apply to the RNN hidden to hidden matrix')\nparser.add_argument('--seed', type=int, default=1111,\n                    help='random seed')\nparser.add_argument('--nonmono', type=int, default=5,\n                    help='non-monotone interval (epochs of non-improving validation loss to tolerate)')\nparser.add_argument('--cuda', action='store_false',\n                    help='use CUDA')\nparser.add_argument('--log-interval', type=int, default=200, metavar='N',\n                    help='report interval')\nrandomhash = ''.join(str(time.time()).split('.'))\nparser.add_argument('--save', type=str,  default=randomhash+'.pt',\n                    help='path to save the final model')\nparser.add_argument('--alpha', type=float, default=2,\n                    help='alpha L2 regularization on RNN activation (alpha = 0 means no 
regularization)')\nparser.add_argument('--beta', type=float, default=1,\n                    help='beta slowness regularization applied on RNN activation (beta = 0 means no regularization)')\nparser.add_argument('--wdecay', type=float, default=1.2e-6,\n                    help='weight decay applied to all weights')\nparser.add_argument('--resume', type=str,  default='',\n                    help='path of model to resume')\nparser.add_argument('--optimizer', type=str,  default='sgd',\n                    help='optimizer to use (sgd, adam)')\nparser.add_argument('--when', nargs=\"+\", type=int, default=[-1],\n                    help='When (which epochs) to divide the learning rate by 10 - accepts multiple')\nargs = parser.parse_args()\nargs.tied = True\n\n# Set the random seed manually for reproducibility.\nnp.random.seed(args.seed)\ntorch.manual_seed(args.seed)\nif torch.cuda.is_available():\n    if not args.cuda:\n        print(\"WARNING: You have a CUDA device, so you should probably run with --cuda\")\n    else:\n        torch.cuda.manual_seed(args.seed)\n\n###############################################################################\n# Load data\n###############################################################################\n\ndef model_save(fn):\n    with open(fn, 'wb') as f:\n        torch.save([model, criterion, optimizer], f)\n\ndef model_load(fn):\n    global model, criterion, optimizer\n    with open(fn, 'rb') as f:\n        model, criterion, optimizer = torch.load(f, map_location=torch.device(args.numgpu))\n\nimport platform\nprint(\"Torch version:\", torch.__version__, \"Numpy version:\", np.version.version, \"Python version:\", platform.python_version())\n\nimport os\nimport hashlib\nfn = 'corpus.{}.data'.format(hashlib.md5(args.data.encode()).hexdigest())\nif os.path.exists(fn):\n    print('Loading cached dataset...')\n    corpus = torch.load(fn)\nelse:\n    print('Producing dataset...')\n    corpus = data.Corpus(args.data)\n    torch.save(corpus, 
fn)\n\neval_batch_size = 10\ntest_batch_size = 1\ntrain_data = batchify(corpus.train, args.batch_size, args)\nval_data = batchify(corpus.valid, eval_batch_size, args)\ntest_data = batchify(corpus.test, test_batch_size, args)\n\n\n#train_data = train_data[:5000,:]   # For debugging\n\n###############################################################################\n# Build the model\n###############################################################################\n\nfrom splitcross import SplitCrossEntropyLoss\ncriterion = None\n\nntokens = len(corpus.dictionary)\nmyparams={}\nmyparams['cliptype'] = args.cliptype\nmyparams['modultype'] = args.modultype\nmyparams['modulout'] = args.modulout\nmyparams['hebboutput'] = args.hebboutput\nmyparams['alphatype'] = args.alphatype\n\nsuffix = args.model+'_'+myparams['cliptype']+'_'+myparams['modultype']+'_'+myparams['modulout']+'_'+myparams['hebboutput']+'_'+myparams['alphatype']+'_lr'+str(args.lr)+'_'+str(args.nlayers)+'l_'+str(args.nhid)+'h'\nRESULTSFILENAME = 'results_'+suffix+'.txt'\n\nMODELFILENAME = args.file\n\n###\nif not criterion:\n    splits = []\n    if ntokens > 500000:\n        # One Billion\n        # This produces fairly even matrix mults for the buckets:\n        # 0: 11723136, 1: 10854630, 2: 11270961, 3: 11219422\n        splits = [4200, 35000, 180000]\n    elif ntokens > 75000:\n        # WikiText-103\n        splits = [2800, 20000, 76000]\n    print('Using', splits)\n    criterion = SplitCrossEntropyLoss(args.emsize, splits=splits, verbose=False)\n###\n#params = list(model.parameters()) + list(criterion.parameters())\n#if args.cuda:\n#    model = model.cuda()\n#    criterion = criterion.cuda()\n#    params = list(model.parameters()) + list(criterion.parameters())\n####\n#total_params = sum(x.size()[0] * x.size()[1] if len(x.size()) > 1 else x.size()[0] for x in params if x.size())\n#print('Args:', args)\n#print('Model total parameters:', 
total_params)\n\n###############################################################################\n# Training code\n###############################################################################\n\ndef evaluate(data_source, batch_size=10):\n    # Turn on evaluation mode which disables dropout.\n    model.eval()\n    with torch.no_grad():\n        if args.model == 'QRNN': model.reset()\n        total_loss = 0\n        ntokens = len(corpus.dictionary)\n        hidden = model.init_hidden(batch_size)\n        for i in range(0, data_source.size(0) - 1, args.bptt):\n            data, targets = get_batch(data_source, i, args, evaluation=True)\n            output, hidden = model(data, hidden)\n            total_loss += len(data) * criterion(model.decoder.weight, model.decoder.bias, output, targets).data\n            hidden = repackage_hidden(hidden)\n        #return total_loss[0] / len(data_source)\n    return total_loss / len(data_source)\n\n\n# Loop over epochs.\nlr = args.lr\nbest_val_loss = []\nstored_loss = 100000000\n\nprint(\"MyParams:\", myparams)\nprint(\"Args:\", args)\n\n# Load the best saved model.\nmodel_load(MODELFILENAME)\n\n\nNUMGPU = args.numgpu\nparams = list(model.parameters()) + list(criterion.parameters())\nif args.cuda:\n    model = model.cuda(device=NUMGPU)\n    criterion = criterion.cuda(device=NUMGPU)\n    params = list(model.parameters()) + list(criterion.parameters())\n###\ntotal_params = sum(x.numel() for x in params)#  if x.numel())\nprint('Args:', args)\nprint('Model total parameters:', total_params)\n\n#pdb.set_trace()\n\n# Run on test data.\ntest_loss = evaluate(test_data, test_batch_size)\nprint('=' * 89)\nprint('| End of training | test loss {:5.2f} | test ppl {:8.2f} | test bpc {:8.3f}'.format(\n    test_loss, math.exp(test_loss), test_loss / math.log(2)))\nprint('=' * 89)\n"
  },
  {
    "path": "awd-lstm-lm/tmp.py",
    "content": "import torch\nfrom torch import nn\nfrom torch.autograd import Variable\nimport torch.nn.functional as F\nimport numpy as np\n\n\nimport pdb\n\n\nclass PlasticLSTM(nn.Module):\n    def __init__(self, isize, hsize, params):\n        super(PlasticLSTM, self).__init__()\n        self.softmax= torch.nn.functional.softmax\n        #if params['activ'] == 'tanh':\n        self.activ = F.tanh\n\n        ok=0\n        if 'cliptype' in params:\n            self.cliptype = params['cliptype']\n            ok+=1\n        if 'modultype' in params:\n            self.modultype = params['modultype']\n            ok+=1\n        if 'hebboutput' in params:\n            self.hebboutput = params['hebboutput']\n            ok+=1\n        if 'modulout' in params:\n            self.modulout= params['modulout']\n            ok+=1\n        if 'alphatype' in params:\n            self.alphatype= params['alphatype']\n            ok+=1\n        if ok < 5:\n            raise ValueError('When using PlasticLSTM, must specify cliptype, modultype, modulout, alphatype and hebboutput in params')\n\n        # Plastic connection parameters:\n        self.w =  torch.nn.Parameter(.02 * torch.rand(hsize, hsize) - .01)\n        if self.alphatype == 'fanout':\n            self.alpha = torch.nn.Parameter(.001 * torch.ones(1)) #torch.rand(1,1,hsize))\n        else:\n            self.alpha =  torch.nn.Parameter(.00001 * torch.rand(hsize, hsize))\n        if self.modultype == 'none':\n            self.eta = torch.nn.Parameter(.01 * torch.ones(1))  # Everyone has the same eta (Note: if a parameter is not actually used, there can be problems with ASGD handling in main.py) \n        #self.eta = .01\n        \n        self.h2f = torch.nn.Linear(hsize, hsize)\n        self.h2i = torch.nn.Linear(hsize, hsize)\n        self.h2opt = torch.nn.Linear(hsize, hsize)\n        #self.h2c = torch.nn.Linear(hsize, hsize)  # This (equivalent to Whg in the PyTorch docs, Uc in Wikipedia) is replaced by the plastic 
connection\n        self.x2f = torch.nn.Linear(isize, hsize)\n        self.x2opt = torch.nn.Linear(isize, hsize)\n        self.x2i = torch.nn.Linear(isize, hsize)\n        self.x2c = torch.nn.Linear(isize, hsize)\n        \n        if self.modultype != 'none':\n            self.h2mod = torch.nn.Linear(hsize, 1)  # Although called 'h2mod', it may take input from h or c depending on modultype value\n        if self.modulout == 'fanout':\n            self.modfanout = torch.nn.Linear(1, hsize)  \n        \n        self.isize = isize\n        self.hsize = hsize\n\n\n    def forward(self, inputs, hidden): #, hebb, et, pw):  # hidden is a tuple of h, c and hebb\n        \n        hebb = hidden[2]\n        fgt = F.sigmoid(self.x2f(inputs) + self.h2f(hidden[0]))\n        ipt = F.sigmoid(self.x2i(inputs) + self.h2i(hidden[0]))\n        opt = F.sigmoid(self.x2opt(inputs) + self.h2opt(hidden[0]))\n        #cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, F.tanh(self.x2c(inputs) + self.h2c(hidden[0])))\n        \n        # To implement plasticity, we replace h2c / Whg / Uc with a plastic connection composed of w, alpha and hebb\n        # Note that h2c / Whg / Uc is the matrix of weights that takes in the\n        # previous time-step h, and whose output (after adding the current input \n        # and passing through tanh) is multiplied by the input gates before being \n        # added to the cell state\n        if self.cliptype == 'aditya':\n            # Each *column* in w, hebb and alpha constitutes the inputs to a single cell\n            # For w and alpha, columns are 2nd dimension (i.e. 
dim 1); for hebb, it's dimension 2 (dimension 0 is batch)\n            h2coutput = hidden[0].unsqueeze(1).bmm(self.w + torch.mul(self.alpha, torch.clamp(hebb, min=-1.0, max=1.0))).squeeze()  \n        else:\n            h2coutput = hidden[0].unsqueeze(1).bmm(self.w + torch.mul(self.alpha, hebb)).squeeze()  \n            #if np.random.rand() < .1:\n            #    pdb.set_trace()\n        inputstocell =  F.tanh(self.x2c(inputs) + h2coutput)\n        #inputstocell =  F.tanh(self.x2c(inputs) + torch.matmul(hidden[0].unsqueeze(1), self.w.unsqueeze(0)).squeeze(1)) \n        cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, inputstocell) #  self.h2c(hidden[0])))\n\n        \n        #pdb.set_trace()\n        \n        hactiv = torch.mul(opt, F.tanh(cell))\n        #pdb.set_trace()\n        \n        if self.hebboutput == 'i2c':\n            deltahebb = torch.bmm(hidden[0].unsqueeze(2), inputstocell.unsqueeze(1))\n        elif self.hebboutput == 'h2co': \n            deltahebb = torch.bmm(hidden[0].unsqueeze(2), h2coutput.unsqueeze(1))\n        elif self.hebboutput == 'cell': \n            deltahebb = torch.bmm(hidden[0].unsqueeze(2), cell.unsqueeze(1))\n        elif self.hebboutput == 'hidden': \n            deltahebb = torch.bmm(hidden[0].unsqueeze(2), hactiv.unsqueeze(1)) \n        else: \n            raise ValueError(\"Must choose Hebbian target output\")\n\n        if self.modultype == 'none':\n            myeta = self.eta\n        elif self.modultype == 'modplasth2mod':\n            myeta = F.tanh(self.h2mod(hactiv)).unsqueeze(2)  # Shape: BatchSize x 1 x 1\n        elif self.modultype == 'modplastc2mod':\n            myeta = F.tanh(self.h2mod(cell)).unsqueeze(2)\n        else: \n            raise ValueError(\"Must choose modulation type\")\n        \n        #pdb.set_trace()\n        if self.modultype != 'none' and self.modulout == 'fanout':\n            # Each *column* in w, hebb and alpha constitutes the inputs to a single cell\n            # For w and alpha, 
columns are 2nd dimension (i.e. dim 1); for hebb, it's dimension 2 (dimension 0 is batch)\n            # The output of the following line has shape BatchSize x 1 x NHidden, i.e. 1 line and NHidden columns for each \n            # batch element. When multiplying by hebb (BatchSize x NHidden x NHidden), broadcasting will provide a different\n            # value for each cell but the same value for all inputs of a cell, as required by fanout concept.\n             myeta = self.modfanout(myeta).squeeze().unsqueeze(1)              \n\n        if self.cliptype == 'decay':\n            hebb = (1 - myeta) * hebb + myeta * deltahebb\n        elif self.cliptype == 'clip':\n            hebb = torch.clamp(hebb + myeta * deltahebb, min=-1.0, max=1.0)\n        elif self.cliptype == 'aditya':\n            hebb = hebb + myeta * deltahebb   \n        else: \n            raise ValueError(\"Must choose clip type\")\n\n        hidden = (hactiv, cell, hebb)\n        activout = hactiv #self.h2o(hactiv)\n        #if np.isnan(np.sum(hactiv.data.cpu().numpy())) or np.isnan(np.sum(hidden[1].data.cpu().numpy())) :\n        #    raise ValueError(\"Nan detected !\")\n\n        return activout, hidden #, hebb, et, pw\n\n\n\nclass MyLSTM(nn.Module):\n    def __init__(self, isize, hsize):\n        super(MyLSTM, self).__init__()\n        self.softmax= torch.nn.functional.softmax\n        #if params['activ'] == 'tanh':\n        self.activ = F.tanh\n        self.h2f = torch.nn.Linear(hsize, hsize)\n        self.h2i = torch.nn.Linear(hsize, hsize)\n        self.h2opt = torch.nn.Linear(hsize, hsize)\n        self.h2c = torch.nn.Linear(hsize, hsize)\n        self.x2f = torch.nn.Linear(isize, hsize)\n        self.x2opt = torch.nn.Linear(isize, hsize)\n        self.x2i = torch.nn.Linear(isize, hsize)\n        self.x2c = torch.nn.Linear(isize, hsize)\n        self.isize = isize\n        self.hsize = hsize\n\n\n    def forward(self, inputs, hidden): #, hebb, et, pw):  # hidden is a tuple of h and c 
states\n            \n        fgt = F.sigmoid(self.x2f(inputs) + self.h2f(hidden[0]))\n        ipt = F.sigmoid(self.x2i(inputs) + self.h2i(hidden[0]))\n        opt = F.sigmoid(self.x2opt(inputs) + self.h2opt(hidden[0]))\n        cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, F.tanh(self.x2c(inputs) + self.h2c(hidden[0])))\n        hactiv = torch.mul(opt, F.tanh(cell))\n        #pdb.set_trace()\n        hidden = (hactiv, cell)\n        activout = hactiv #self.h2o(hactiv)\n        #if np.isnan(np.sum(hactiv.data.cpu().numpy())) or np.isnan(np.sum(hidden[1].data.cpu().numpy())) :\n        #    raise ValueError(\"Nan detected !\")\n\n        #pdb.set_trace()\n\n        return activout, hidden #, hebb, et, pw\n\n\n\n"
  },
  {
    "path": "awd-lstm-lm/utils.py",
    "content": "import torch\n#from torch.autograd import Variable\n\ndef repackage_hidden(h):\n    \"\"\"Wraps hidden states in new Tensors, to detach them from their history.\"\"\"\n    #if type(h) == Variable:\n        #return Variable(h.data)\n    if isinstance(h, torch.Tensor):\n        return h.detach()\n    else:\n        return tuple(repackage_hidden(v) for v in h)\n\ndef batchify(data, bsz, args):\n    # Work out how cleanly we can divide the dataset into bsz parts.\n    nbatch = data.size(0) // bsz\n    # Trim off any extra elements that wouldn't cleanly fit (remainders).\n    data = data.narrow(0, 0, nbatch * bsz)\n    # Evenly divide the data across the bsz batches.\n    data = data.view(bsz, -1).t().contiguous()\n    if args.cuda:\n        data = data.cuda(device=args.numgpu)\n    return data\n\ndef get_batch(source, i, args, seq_len=None, evaluation=False):\n    seq_len = min(seq_len if seq_len else args.bptt, len(source) - 1 - i)\n    data = source[i:i+seq_len]\n    target = source[i+1:i+1+seq_len].view(-1)\n    return data, target\n"
  },
  {
    "path": "awd-lstm-lm/weight_drop.py",
    "content": "import torch\nfrom torch.nn import Parameter\nfrom functools import wraps\n\nclass WeightDrop(torch.nn.Module):\n    def __init__(self, module, weights, dropout=0, variational=False):\n        super(WeightDrop, self).__init__()\n        self.module = module\n        self.weights = weights\n        self.dropout = dropout\n        self.variational = variational\n        self._setup()\n\n    def widget_demagnetizer_y2k_edition(*args, **kwargs):\n        # We need to replace flatten_parameters with a nothing function\n        # It must be a function rather than a lambda as otherwise pickling explodes\n        # We can't write boring code though, so ... WIDGET DEMAGNETIZER Y2K EDITION!\n        # (╯°□°）╯︵ ┻━┻\n        return\n\n    def _setup(self):\n        # Terrible temporary solution to an issue regarding compacting weights re: CUDNN RNN\n        if issubclass(type(self.module), torch.nn.RNNBase):\n            self.module.flatten_parameters = self.widget_demagnetizer_y2k_edition\n\n        for name_w in self.weights:\n            print('Applying weight drop of {} to {}'.format(self.dropout, name_w))\n            w = getattr(self.module, name_w)\n            del self.module._parameters[name_w]\n            self.module.register_parameter(name_w + '_raw', Parameter(w.data))\n\n    def _setweights(self):\n        for name_w in self.weights:\n            raw_w = getattr(self.module, name_w + '_raw')\n            w = None\n            if self.variational:\n                mask = torch.autograd.Variable(torch.ones(raw_w.size(0), 1))\n                if raw_w.is_cuda: mask = mask.cuda()\n                mask = torch.nn.functional.dropout(mask, p=self.dropout, training=True)\n                w = mask.expand_as(raw_w) * raw_w\n            else:\n                w = torch.nn.functional.dropout(raw_w, p=self.dropout, training=self.training)\n            setattr(self.module, name_w, w)\n\n    def forward(self, *args):\n        self._setweights()\n        return 
self.module.forward(*args)\n\nif __name__ == '__main__':\n    import torch\n    from weight_drop import WeightDrop\n\n    # Input is (seq, batch, input)\n    x = torch.autograd.Variable(torch.randn(2, 1, 10)).cuda()\n    h0 = None\n\n    ###\n\n    print('Testing WeightDrop')\n    print('=-=-=-=-=-=-=-=-=-=')\n\n    ###\n\n    print('Testing WeightDrop with Linear')\n\n    lin = WeightDrop(torch.nn.Linear(10, 10), ['weight'], dropout=0.9)\n    lin.cuda()\n    run1 = [x.sum() for x in lin(x).data]\n    run2 = [x.sum() for x in lin(x).data]\n\n    print('All items should be different')\n    print('Run 1:', run1)\n    print('Run 2:', run2)\n\n    assert run1[0] != run2[0]\n    assert run1[1] != run2[1]\n\n    print('---')\n\n    ###\n\n    print('Testing WeightDrop with LSTM')\n\n    wdrnn = WeightDrop(torch.nn.LSTM(10, 10), ['weight_hh_l0'], dropout=0.9)\n    wdrnn.cuda()\n\n    run1 = [x.sum() for x in wdrnn(x, h0)[0].data]\n    run2 = [x.sum() for x in wdrnn(x, h0)[0].data]\n\n    print('First timesteps should be equal, all others should differ')\n    print('Run 1:', run1)\n    print('Run 2:', run2)\n\n    # First time step, not influenced by hidden to hidden weights, should be equal\n    assert run1[0] == run2[0]\n    # Second step should not\n    assert run1[1] != run2[1]\n\n    print('---')\n"
  },
  {
    "path": "images/OpusHdfsCopy.py",
    "content": "# Uber-only code for interacting with hdfs\n#\n# Copyright (c) 2018 Uber Technologies, Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#    http://www.apache.org/licenses/LICENSE-2.0\n#\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n\nimport os\nimport os.path\n\ndef checkHdfs():\n    return os.path.isfile('/opt/hadoop/latest/bin/hdfs')\n\ndef transferFileToHdfsPath(sourcepath, targetpath):\n    hdfspath = targetpath\n    targetdir = os.path.dirname(targetpath)\n    os.system('/opt/hadoop/latest/bin/hdfs dfs -mkdir -p {}'.format(targetdir))\n    result = os.system(\n        '/opt/hadoop/latest/bin/hdfs dfs -copyFromLocal -f {} {}'.format(sourcepath, hdfspath)\n    )\n    if result != 0:\n        raise OSError('Cannot copyFromLocal {} {} returned {}'.format(sourcepath, hdfspath, result))\n\ndef transferFileToHdfsDir(sourcepath, targetdir):\n    hdfspath = os.path.join(targetdir, os.path.basename(sourcepath))\n    os.system('/opt/hadoop/latest/bin/hdfs dfs -mkdir -p {}'.format(targetdir))\n    result = os.system(\n        '/opt/hadoop/latest/bin/hdfs dfs -copyFromLocal -f {} {}'.format(sourcepath, hdfspath)\n    )\n    if result != 0:\n        raise OSError('Cannot copyFromLocal {} {} returned {}'.format(sourcepath, hdfspath, result))\n\n"
  },
  {
    "path": "images/README.md",
    "content": "## Images\n\nThis code implements the image completion task: three images are shown several times; then one of the images is presented half-erased, and the network must reconstruct the missing portion of the image.\n\nTo run this code, you must download the [CIFAR10 dataset](https://www.cs.toronto.edu/~kriz/cifar.html) (Python version), and copy the `data_batch_*` files into this directory.\n"
  },
  {
    "path": "images/anim.py",
    "content": "# Make an animation from the activities of the network over time\n#\n# Copyright (c) 2018 Uber Technologies, Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#    http://www.apache.org/licenses/LICENSE-2.0\n#\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n\n\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nimport scipy\nimport scipy.misc\nfrom torch import optim\nimport random\nimport sys\nimport pickle\nimport pdb\nimport time\n\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport matplotlib.animation as animation\nimport glob\n\nnp.set_printoptions(precision=3)\n\n\nimport images as pics\nfrom images import Network\n\nfig = plt.figure()\nplt.axis('off')\n\n# Note that this is a different file from the ones used in training\nwith open('./data_batch_5', 'rb') as fo:\n    imagedict = pickle.load(fo, encoding='bytes')\nimagedata = imagedict[b'data']\n\n#suffix = 'eta_prestime_20_probadegrade_0.5_interpresdelay_2_learningrate_0.0001_prestimetest_3_rngseed_0_nbiter_50000_nbprescycles_3_inputboost_1.0_eta_0.01_nbpatterns_3_patternsize_1024' # This one used for first draft of the paper, rngseed 4\n#suffix = 
'eta_inputboost_1.0_learningrate_0.0001_nbprescycles_3_interpresdelay_2_eta_0.01_rngseed_0_probadegrade_0.5_nbiter_150000_nbpatterns_3_prestimetest_3_patternsize_1024_prestime_20'\n#suffix=\"eta_nbpatterns_3_inputboost_1.0_nbprescycles_3_prestime_20_prestimetest_5_interpresdelay_2_patternsize_1024_nbiter_50000_probadegrade_0.5_learningrate_0.0001_eta_0.01_rngseed_0\"\n\nsuffix='etarefiner_eta_0.01_nbpatterns_3_interpresdelay_2_patternsize_1024_prestime_20_learningrate_1e-05_nbprescycles_3_rngseed_0_prestimetest_3_probadegrade_0.5_inputboost_1.0_nbiter_150000'\n\n\n#fn = './tmp/results_'+suffix+'.dat'\nfn = './results_'+suffix+'.dat'\nwith open(fn, 'rb') as fo:\n    myw = pickle.load(fo)\n    myalpha = pickle.load(fo)\n    myeta = pickle.load(fo)\n    myall_losses = pickle.load(fo)\n    myparams = pickle.load(fo)\n\nnet = Network(myparams)\n\n\n#np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n#rngseed=18\n#rngseed=4\nrngseed=7\nnp.random.seed(rngseed); random.seed(rngseed); torch.manual_seed(rngseed)\n\n#print myall_losses\n\nttype = torch.cuda.FloatTensor # Must match the one in pics_eta.py\n#ttype = torch.FloatTensor # Must match the one in pics_eta.py\n\nnet.w.data = torch.from_numpy(myw).type(ttype)\nnet.alpha.data = torch.from_numpy(myalpha).type(ttype)\nnet.eta.data = torch.from_numpy(myeta).type(ttype)\nprint(net.w.data[:10,:10])\nprint(net.eta.data)\n\nNBPICS = 1 # 10 \nnn=1\n\nimagesize = int(np.sqrt(myparams['patternsize']))\noutputs={}\nFILLINGSTEPS = myparams['prestimetest'] + myparams['interpresdelay'] + 1\n\n\n\n\n\n# Two ways to do it : show the full actual process, or show a \"simnplified\" version where you just show the three images and the pattern completion (slowed down)\n\nSIMPLIFIED = 0\n\nif SIMPLIFIED:\n\n    for numpic in range(NBPICS):\n\n        print(\"Pattern\", numpic)\n\n        z = np.random.rand()\n        z = np.random.rand()\n\n        inputsTensor, targetPattern = 
pics.generateInputsAndTarget(myparams, contiguousperturbation=True)\n\n        y = net.initialZeroState()\n        hebb = net.initialZeroHebb()\n        net.zeroDiagAlpha()\n\n        ax_imgs = []\n\n        print(\"Running the episode...\")\n        for numstep in range(myparams['nbsteps']):\n            y, hebb = net(Variable(inputsTensor[numstep], requires_grad=False), y, hebb)\n            output = y.data.cpu().numpy()[0][:-1].reshape((imagesize, imagesize))\n            #output = scipy.misc.imresize(output, 4.0)\n            #plt.subplot(NBPICS, FILLINGSTEPS, nn)\n            #plt.axis('off')\n            #plt.imshow(output, cmap='gray', vmin=-1.0, vmax=1.0)\n\n            #if numstep == 1  or numstep == myparams['prestime'] + myparams['interpresdelay'] + 1 or  \\\n                    #numstep == 2 * (myparams['prestime'] + myparams['interpresdelay']) + 1 or \\\n\n\n            # Show the last set of 3 patterns, and the completion:\n            if numstep ==  myparams['nbsteps'] - myparams['prestimetest'] - myparams['interpresdelay'] - 2 or \\\n                    numstep ==  myparams['nbsteps'] - myparams['prestimetest'] - (myparams['interpresdelay'] + myparams['prestime']) - myparams['interpresdelay'] - 2 or \\\n                    numstep ==  myparams['nbsteps'] - myparams['prestimetest'] - (myparams['interpresdelay'] + myparams['prestime']) *2 - myparams['interpresdelay'] - 2  or \\\n                    numstep >= myparams['nbsteps'] - myparams['prestimetest'] :\n                if numstep == myparams['nbsteps'] - myparams['prestimetest'] :\n                    output_half = output.copy()\n                    output_half[16:,:] = 0      # NOTE: we are assuming that the grayed part will be the bottom one, which is only true for half the cases\n                    a1 = plt.imshow(output_half, animated=True, cmap='gray', vmin=-1.0, vmax=1.0)\n                else:\n                    a1 = plt.imshow(output, animated=True, cmap='gray', vmin=-1.0, vmax=1.0)\n  
              #a2 = plt.text(1, 1, str(numstep)+\"/\"+str(myparams['nbsteps']), fontsize=12, color='r')\n                if numstep < myparams['nbsteps'] - myparams['prestimetest'] :\n                        a3 = plt.text(1, 1,  \"Pattern \"+str(nn), fontsize=12, color='r')\n                else:\n                        a3 = plt.text(1, 1, \"Pattern completion\", fontsize=12, color='r')\n                ax_imgs.append([a1, a3])  \n                #ax_imgs.append([fullimg])  \n                nn += 1\n                #scipy.misc.imsave('pic'+str(numpic)+'_'+str(numstep)+'.png', output)\n\n\n    #plt.show(block=True)\n        print(\"Writing out the animation file\")\n        anim = animation.ArtistAnimation(fig, ax_imgs, repeat_delay=2000)  # repeat_delay is ignored...\n        anim.save('anim_short_'+str(numpic)+'.gif', writer='imagemagick', fps=1)\n       \n        \n        # All images could be  rotated 90deg. This allows us to display each set as a\n        # vertical column by rotating the final image 90 degrees too.\n\n        #output = y.data.cpu().numpy()[0][:-1].reshape((imagesize, imagesize))\n        #pattern1 = inputsTensor.cpu().numpy()[0][0][:-1].reshape((imagesize, imagesize))\n        #pattern2 = inputsTensor.cpu().numpy()[myparams['prestime']+myparams['interpresdelay']+1][0][:-1].reshape((imagesize, imagesize))\n        #pattern3 = inputsTensor.cpu().numpy()[2*(myparams['prestime']+myparams['interpresdelay'])+1][0][:-1].reshape((imagesize, imagesize))\n        #blankedpattern = inputsTensor.cpu().numpy()[-1][0][:-1].reshape((imagesize, imagesize))\n\n        #plt.subplot(NBPICS,5,nn)\n        #plt.axis('off')\n        #plt.imshow(pattern1, cmap='gray', vmin=-1.0, vmax=1.0)\n        #plt.subplot(NBPICS,5,nn+1)\n        #plt.axis('off')\n        #plt.imshow(pattern2, cmap='gray', vmin=-1.0, vmax=1.0)\n        #plt.subplot(NBPICS,5,nn+2)\n        #plt.axis('off')\n        #plt.imshow(pattern3, cmap='gray', vmin=-1.0, vmax=1.0)\n        
#plt.subplot(NBPICS,5,nn+3)\n        #plt.axis('off')\n        #plt.imshow(blankedpattern, cmap='gray', vmin=-1.0, vmax=1.0)\n        #plt.subplot(NBPICS,5,nn+4)\n        #plt.imshow(output, cmap='gray', vmin=-1.0, vmax=1.0)\n        #plt.axis('off')\n        #nn += 5\n\n        #td = targetPattern.cpu().numpy()\n        #yd = y.data.cpu().numpy()[0][:-1]\n        #absdiff = np.abs(td-yd)\n        #print(\"Mean / median / max abs diff:\", np.mean(absdiff), np.median(absdiff), np.max(absdiff))\n        #print(\"Correlation (full / sign): \", np.corrcoef(td, yd)[0][1], np.corrcoef(np.sign(td), np.sign(yd))[0][1])\n        ##print inputs[numstep]\n    #plt.subplots_adjust(wspace=.1, hspace=.1)\n\n\nelse:\n    for numpic in range(NBPICS):\n\n        print(\"Pattern\", numpic)\n\n        z = np.random.rand()\n        z = np.random.rand()\n\n        inputsTensor, targetPattern = pics.generateInputsAndTarget(myparams, contiguousperturbation=True)\n\n        y = net.initialZeroState()\n        hebb = net.initialZeroHebb()\n        net.zeroDiagAlpha()\n\n        ax_imgs = []\n\n        print(\"Running the episode...\")\n        for numstep in range(myparams['nbsteps']):\n            y, hebb = net(Variable(inputsTensor[numstep], requires_grad=False), y, hebb)\n            output = y.data.cpu().numpy()[0][:-1].reshape((imagesize, imagesize))\n            #output = scipy.misc.imresize(output, 4.0)\n            #plt.subplot(NBPICS, FILLINGSTEPS, nn)\n            #plt.axis('off')\n            #plt.imshow(output, cmap='gray', vmin=-1.0, vmax=1.0)\n            a1 = plt.imshow(output, animated=True, cmap='gray', vmin=-1.0, vmax=1.0)\n            a2 = plt.text(1, 1, str(numstep)+\"/\"+str(myparams['nbsteps']), fontsize=12, color='r')\n            if numstep < myparams['nbsteps'] - myparams['prestimetest'] -  1:\n                a3 = plt.text(14, 1,  \"Pattern presentations\", fontsize=12, color='r')\n            else:\n                a3 = plt.text(14, 1, \"Pattern completion\", 
fontsize=12, color='r')\n            ax_imgs.append([a1, a2, a3])  \n            #ax_imgs.append([fullimg])  \n            nn += 1\n            #scipy.misc.imsave('pic'+str(numpic)+'_'+str(numstep)+'.png', output)\n\n        # Post-completion, keep the last image up a bit \n        for numstep_add in range(50):\n            a1 = plt.imshow(output, animated=True, cmap='gray', vmin=-1.0, vmax=1.0)\n            a2 = plt.text(1, 1, str(myparams['nbsteps'])+\"/\"+str(myparams['nbsteps']), fontsize=12, color='r')\n            a3 = plt.text(14, 1, \"Pattern completion\", fontsize=12, color='r')\n            ax_imgs.append([a1, a2, a3])  \n\n\n\n    #plt.show(block=True)\n        print(\"Writing out the animation file\")\n        anim = animation.ArtistAnimation(fig, ax_imgs, repeat_delay=2000)  # repeat_delay is ignored...\n        anim.save('anim_full_'+str(numpic)+'.gif', writer='imagemagick', fps=10)\n       \n        \n        # All images could be  rotated 90deg. This allows us to display each set as a\n        # vertical column by rotating the final image 90 degrees too.\n\n        #output = y.data.cpu().numpy()[0][:-1].reshape((imagesize, imagesize))\n        #pattern1 = inputsTensor.cpu().numpy()[0][0][:-1].reshape((imagesize, imagesize))\n        #pattern2 = inputsTensor.cpu().numpy()[myparams['prestime']+myparams['interpresdelay']+1][0][:-1].reshape((imagesize, imagesize))\n        #pattern3 = inputsTensor.cpu().numpy()[2*(myparams['prestime']+myparams['interpresdelay'])+1][0][:-1].reshape((imagesize, imagesize))\n        #blankedpattern = inputsTensor.cpu().numpy()[-1][0][:-1].reshape((imagesize, imagesize))\n\n        #plt.subplot(NBPICS,5,nn)\n        #plt.axis('off')\n        #plt.imshow(pattern1, cmap='gray', vmin=-1.0, vmax=1.0)\n        #plt.subplot(NBPICS,5,nn+1)\n        #plt.axis('off')\n        #plt.imshow(pattern2, cmap='gray', vmin=-1.0, vmax=1.0)\n        #plt.subplot(NBPICS,5,nn+2)\n        #plt.axis('off')\n        #plt.imshow(pattern3, 
cmap='gray', vmin=-1.0, vmax=1.0)\n        #plt.subplot(NBPICS,5,nn+3)\n        #plt.axis('off')\n        #plt.imshow(blankedpattern, cmap='gray', vmin=-1.0, vmax=1.0)\n        #plt.subplot(NBPICS,5,nn+4)\n        #plt.imshow(output, cmap='gray', vmin=-1.0, vmax=1.0)\n        #plt.axis('off')\n        #nn += 5\n\n        #td = targetPattern.cpu().numpy()\n        #yd = y.data.cpu().numpy()[0][:-1]\n        #absdiff = np.abs(td-yd)\n        #print(\"Mean / median / max abs diff:\", np.mean(absdiff), np.median(absdiff), np.max(absdiff))\n        #print(\"Correlation (full / sign): \", np.corrcoef(td, yd)[0][1], np.corrcoef(np.sign(td), np.sign(yd))[0][1])\n        ##print inputs[numstep]\n    #plt.subplots_adjust(wspace=.1, hspace=.1)\n"
  },
  {
    "path": "images/images.py",
    "content": "# Differentiable plasticity: natural image memorization and reconstruction.\n#\n# Copyright (c) 2018 Uber Technologies, Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#    http://www.apache.org/licenses/LICENSE-2.0\n#\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n\n\n# This program uses the click module rather than argparse to scan command-line arguments. I won't do that again. \n\n# You start getting acceptable results after ~3000 episodes (~15 minutes with a standard GPU). Let it run longer for better results.\n\n# To observe the results, run testpics.py (which uses the output files produced by this program)\n\n\n\n\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport click\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nfrom torch import optim\nimport random\nimport sys\nimport pickle\nimport pdb\nimport time\nimport os\nimport platform\n# Uber-only:\n#import OpusHdfsCopy\n#from OpusHdfsCopy import transferFileToHdfsDir, checkHdfs\n\n\n# Loading the image data. 
This requires downloading the CIFAR 10 dataset (Python version) - https://www.cs.toronto.edu/~kriz/cifar.html\nimagedata=np.zeros((0, 1024*3))\nfor numfile in range(4):\n    with open('./data_batch_'+str(numfile+1), 'rb') as fo:\n        #imagedict = pickle.load(fo)  # Python 2\n        imagedict = pickle.load(fo, encoding='bytes')  # Python 3\n    imagedata = np.concatenate((imagedata, imagedict[b'data']), axis=0)\n\nnp.set_printoptions(precision=4)\n\n\ndefaultParams = {\n    'nbpatterns': 3,        # number of images per episode\n    'nbprescycles': 3,      # number of presentations for each image\n    'prestime': 20,         # number of time steps for each image presentation\n    'prestimetest': 3,      # number of time steps for the test (degraded) image\n    'interpresdelay': 2,    # number of time steps (with zero input) between two presentations\n    'patternsize': 1024,    # size of the images (32 x 32 = 1024)\n    'nbiter': 100000,       # number of episodes\n    'probadegrade': .5,     # when contiguousperturbation is False (which it shouldn't be), probability of zeroing each pixel in the test image\n    'lr': 1e-4,   # Adam learning rate\n    'print_every': 10,      # how often to print statistics and save files\n    'homogenous': 0,        # whether alpha should be shared across connections \n    'rngseed':0             # random seed\n}\n\n\n#ttype = torch.FloatTensor;         # For CPU\nttype = torch.cuda.FloatTensor;     # For GPU\n\n\n# Generate the full list of inputs for an episode\ndef generateInputsAndTarget(params, contiguousperturbation=True):\n    #print((\"Input Boost:\", params['inputboost']))\n    inputT = np.zeros((params['nbsteps'], 1, params['nbneur'])) #inputTensor, initially in numpy format...\n    # Create the random patterns to be memorized in an episode\n    # Floating-point, graded patterns, zero-mean\n    patterns=[]\n    for nump in range(params['nbpatterns']):\n        numpic = np.random.randint(imagedata.shape[0])\n        p = 
imagedata[numpic].reshape((3, 1024)).sum(0).astype(float)\n        p = p[:params['patternsize']]\n        p = p - np.mean(p)\n        p = p / (1e-8+np.max(np.abs(p)))\n        #p = (np.random.randint(2, size=params['patternsize']) - .5) *2   # Binary patterns\n        patterns.append(p)\n    #print \"patterns generated!\"\n    # Now 'patterns' contains the NBPATTERNS patterns to be memorized in this episode - in numpy format\n    # Creating the test pattern, partially zero'ed out, that the network will have to complete\n    testpattern = random.choice(patterns).copy()\n    preservedbits = np.ones(params['patternsize'])\n    \n    if contiguousperturbation: # Contiguous perturbation = one contiguous half of the image is zeroed out. Default (see above).\n        preservedbits[int(params['patternsize']/2):] = 0\n        if np.random.rand() < .5:\n            preservedbits = 1 - preservedbits\n    else: # Otherwise, randomly zero out individual pixels. Because natural images are highly autocorrelated, a trivial approximate solution is to take the average of nearby pixels.\n        preservedbits[:int(params['probadegrade'] * params['patternsize'])] = 0; np.random.shuffle(preservedbits)\n    degradedtestpattern = testpattern * preservedbits\n\n    # Inserting the inputs in the input tensor at the proper places\n    for nc in range(params['nbprescycles']):\n        np.random.shuffle(patterns)\n        for ii in range(params['nbpatterns']):\n            for nn in range(params['prestime']):\n                numi =nc * (params['nbpatterns'] * (params['prestime']+params['interpresdelay'])) + ii * (params['prestime']+params['interpresdelay']) + nn\n                inputT[numi][0][:params['patternsize']] = patterns[ii][:]\n\n    for nn in range(params['prestimetest']):\n        inputT[-params['prestimetest'] + nn][0][:params['patternsize']] = degradedtestpattern[:]\n\n    for nn in range(params['nbsteps']):\n        inputT[nn][0][-1] = 1.0  # Bias neuron is forced to 1\n        
#inputT[nn] *= params['inputboost']       # Strengthen inputs\n\n    inputT = torch.from_numpy(inputT).type(ttype)  # Convert from numpy to Tensor\n    target = torch.from_numpy(testpattern).type(ttype)\n\n    return inputT, target\n\n\n\nclass Network(nn.Module):\n    def __init__(self, params):\n        super(Network, self).__init__()\n        # Notice that the vectors are row vectors, and the matrices are transposed wrt the comp neuro order, following deep learning / pytorch conventions\n        # Each *column* of w targets a single output neuron\n        self.w = Variable(.01 * torch.randn(params['nbneur'], params['nbneur']).type(ttype), requires_grad=True)        # fixed (baseline) weights\n        if params['homogenous'] == 1:\n            self.alpha = Variable(.01 * torch.ones(1).type(ttype), requires_grad=True)                                  # plasticity coefficients: homogenous/shared across connections\n        else:\n            self.alpha = Variable(.01 * torch.randn(params['nbneur'], params['nbneur']).type(ttype),requires_grad=True) # plasticity coefficients: independent\n        self.eta = Variable(.01 * torch.ones(1).type(ttype), requires_grad=True)                            # \"learning rate\" of plasticity, shared across all connections\n        self.params = params\n\n    def forward(self, input, yin, hebb):\n        # Inputs are fed by clamping the output of cells that receive input at the input value, like in standard Hopfield networks\n        # clamps = torch.zeros(1, self.params['nbneur'])\n        clamps = np.zeros(self.params['nbneur'])\n        zz = torch.nonzero(input.data[0].cpu()).numpy().squeeze()\n        #print(zz, zz.shape)\n        clamps[zz] = 1\n        #print(clamps)\n        clamps = Variable(torch.from_numpy(clamps).type(ttype), requires_grad=False).float()\n        yout = F.tanh( yin.mm(self.w + torch.mul(self.alpha, hebb))) * (1 - clamps) + input * clamps\n        hebb = (1 - self.eta) * hebb + self.eta * 
torch.bmm(yin.unsqueeze(2), yout.unsqueeze(1))[0] # bmm used to implement outer product\n        return yout, hebb\n\n    def initialZeroState(self):\n        return Variable(torch.zeros(1, self.params['nbneur']).type(ttype))\n\n    def initialZeroHebb(self):\n        return Variable(torch.zeros(self.params['nbneur'], self.params['nbneur']).type(ttype))\n\n\ndef train(paramdict=None):\n    #params = dict(click.get_current_context().params)\n    print(\"Starting training...\")\n    params = {}\n    params.update(defaultParams)\n    if paramdict:\n        params.update(paramdict)\n    print(\"Passed params: \", params)\n    print(platform.uname())\n    sys.stdout.flush()\n    params['nbsteps'] = params['nbprescycles'] * ((params['prestime'] + params['interpresdelay']) * params['nbpatterns']) + params['prestimetest']  # Total number of steps per episode\n    params['nbneur'] = params['patternsize'] + 1\n    suffix = \"images_\"+\"\".join([str(x)+\"_\" if pair[0] != 'nbneur' and pair[0] != 'nbsteps' and pair[0] != 'print_every' and pair[0] != 'rngseed' else '' for pair in zip(params.keys(), params.values()) for x in pair])[:-1] + '_rngseed_'+str(params['rngseed'])   # Turning the parameters into a nice suffix for filenames; rngseed always appears last\n\n    # Initialize random seeds (first two redundant?)\n    print(\"Setting random seeds\")\n    np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n    #print(click.get_current_context().params)\n    \n    print(\"Initializing network\")\n    net = Network(params)\n    total_loss = 0.0\n    \n    print(\"Initializing optimizer\")\n    optimizer = torch.optim.Adam([net.w, net.alpha, net.eta], lr=params['lr'])\n    all_losses = []\n    #print_every = 20\n    nowtime = time.time()\n    print(\"Starting episodes...\")\n    sys.stdout.flush()\n\n    for numiter in range(params['nbiter']):\n        # print(\"Iter \", numiter)\n        # sys.stdout.flush()\n  
      y = net.initialZeroState()\n        hebb = net.initialZeroHebb()\n        optimizer.zero_grad()\n\n        inputs, target = generateInputsAndTarget(params)\n\n        # Running the episode\n        for numstep in range(params['nbsteps']):\n            y, hebb = net(Variable(inputs[numstep], requires_grad=False), y, hebb)\n\n\n        # Computing gradients, applying optimizer\n        loss = (y[0][:params['patternsize']] - Variable(target, requires_grad=False)).pow(2).sum()\n        loss.backward()\n        optimizer.step()\n\n        lossnum = loss.data[0]\n        total_loss  += lossnum\n\n\n        # Printing statistics, saving files\n        if (numiter+1) % params['print_every'] == 0:\n\n            print(numiter, \"====\")\n            td = target.cpu().numpy()\n            yd = y.data.cpu().numpy()[0][:-1]\n            print(\"y: \", yd[:10])\n            print(\"target: \", td[:10])\n            #print(\"target: \", target.unsqueeze(0)[0][:10])\n            absdiff = np.abs(td-yd)\n            print(\"Mean / median / max abs diff:\", np.mean(absdiff), np.median(absdiff), np.max(absdiff))\n            print(\"Correlation (full / sign): \", np.corrcoef(td, yd)[0][1], np.corrcoef(np.sign(td), np.sign(yd))[0][1])\n            #print inputs[numstep]\n            previoustime = nowtime\n            nowtime = time.time()\n            print(\"Time spent on last\", params['print_every'], \"iters: \", nowtime - previoustime)\n            total_loss /= params['print_every']\n            all_losses.append(total_loss)\n            print(\"Mean loss over last\", params['print_every'], \"iters:\", total_loss)\n            print(\"Saving local files...\")\n            sys.stdout.flush()\n            with open('results_'+suffix+'.dat', 'wb') as fo:\n                pickle.dump(net.w.data.cpu().numpy(), fo)\n                pickle.dump(net.alpha.data.cpu().numpy(), fo)\n                pickle.dump(net.eta.data.cpu().numpy(), fo)\n                pickle.dump(all_losses, 
fo)\n                pickle.dump(params, fo)\n            print(\"ETA:\", net.eta.data.cpu().numpy())\n            with open('loss_'+suffix+'.txt', 'w') as thefile:\n                for item in all_losses:\n                    thefile.write(\"%s\\n\" % item)\n            # Uber-only\n            #print(\"Saving HDFS files...\")\n            #if checkHdfs():\n            #    print(\"Transferring to HDFS...\")\n            #    transferFileToHdfsDir('results_'+suffix+'.dat', '/ailabs/tmiconi/exp/')\n            #    transferFileToHdfsDir('loss_'+suffix+'.txt', '/ailabs/tmiconi/exp/')\n            sys.stdout.flush()\n            sys.stderr.flush()\n\n            total_loss = 0\n\n\n@click.command()\n@click.option('--nbpatterns', default=defaultParams['nbpatterns'])\n@click.option('--nbprescycles', default=defaultParams['nbprescycles'])\n@click.option('--homogenous', default=defaultParams['homogenous'])\n@click.option('--prestime', default=defaultParams['prestime'])\n@click.option('--prestimetest', default=defaultParams['prestimetest'])\n@click.option('--interpresdelay', default=defaultParams['interpresdelay'])\n@click.option('--patternsize', default=defaultParams['patternsize'])\n@click.option('--nbiter', default=defaultParams['nbiter'])\n@click.option('--probadegrade', default=defaultParams['probadegrade'])\n@click.option('--lr', default=defaultParams['lr'])\n@click.option('--print_every', default=defaultParams['print_every'])\n@click.option('--rngseed', default=defaultParams['rngseed'])\ndef main(nbpatterns, nbprescycles, homogenous, prestime, prestimetest, interpresdelay, patternsize, nbiter, probadegrade, lr, print_every, rngseed):\n    train(paramdict=dict(click.get_current_context().params))\n    #print(dict(click.get_current_context().params))\n\nif __name__ == \"__main__\":\n    #train()\n    main()\n\n"
  },
  {
    "path": "images/plotresults.py",
    "content": "import numpy as np\nimport glob\nimport matplotlib.pyplot as plt\n\nfnames = glob.glob('./tmp/loss_simple_*.txt')\n#fnames = glob.glob('./tmp/loss_api_*.txt')\n#fnames = glob.glob('./tmp/loss_fixed_*.txt')\n\n\nplt.ion()\nplt.figure(figsize=(5,4))  # Smaller figure = relative larger fonts\n\nfulllosses=[]\nlosses=[]\nlgts=[]\nfor fn in fnames:\n    z = np.loadtxt(fn)\n    lgts.append(len(z))\n    fulllosses.append(z)\nminlen = min(lgts)\nfor z in fulllosses:\n    losses.append(z[:minlen])\n\nlosses = np.array(losses)\nmeanl = np.mean(losses, axis=0)\nstdl = np.std(losses, axis=0)\n\nhighl = np.max(losses, axis=0)\nlowl = np.min(losses, axis=0)\n#highl = meanl+stdl\n#lowl = meanl-stdl\n\nxx = range(len(meanl))\n\n# xticks and labels\nxt = range(0, len(meanl), 50)\nxtl = [str(10*i) for i in xt]\n\nplt.fill_between(xx, lowl, highl, color='blue', alpha=.5)\nplt.plot(meanl, color='blue')\n#plt.xlabel('Loss (sum square diff. b/w final output and target)')\nplt.xlabel('Number of Episodes')\nplt.ylabel('Loss')\nplt.xticks(xt, xtl)\nplt.tight_layout()\n\n\n"
  },
  {
    "path": "images/request.json",
    "content": "{\n    \"dockerImage\":\"test_tm\", \n    \"tag\":\"master-test-2017_10_31_17_22_28\",\n    \"name\":\"PicsAPIToCompareWithFixed\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 python3 pics_api.py --rngseed {{mesos.instance}}\",\n    \"ramMB\":8000,\n    \"gpus\":1,\n    \"diskMB\":1000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"root\",\n    \"instances\":10,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":3,\n    \"constraints\":{\"sku\":\"1080ti\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "images/showcompletion_eta.py",
"content": "# Old code to show the dynamics of pattern completion: show the output of the network at each time step\n# Useful to understand how the network works (i.e. the need to clear up remnant activity from previous stimuli)\n# May require adjustments to work (e.g. change file names, etc.)\n#\n# Copyright (c) 2018 Uber Technologies, Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#    http://www.apache.org/licenses/LICENSE-2.0\n#\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n\n\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nimport scipy\nimport scipy.misc\nfrom torch import optim\nimport random\nimport sys\nimport pickle\nimport pdb\nimport time\n\nnp.set_printoptions(precision=3)\nimport matplotlib.pyplot as plt\nplt.ion()\n\n\nimport images as pics\nfrom images import Network\n\n#plt.figure()\n\n# Note that this is a different file from the ones used in training\nwith open('../data_batch_5', 'rb') as fo:\n    imagedict = pickle.load(fo, encoding='bytes')\nimagedata = imagedict[b'data']\n\n#suffix = 'eta_prestime_20_probadegrade_0.5_interpresdelay_2_learningrate_0.0001_prestimetest_3_rngseed_0_nbiter_50000_nbprescycles_3_inputboost_1.0_eta_0.01_nbpatterns_3_patternsize_1024' # This one used for first draft of the paper, rngseed 4\n#suffix = 
'eta_inputboost_1.0_learningrate_0.0001_nbprescycles_3_interpresdelay_2_eta_0.01_rngseed_0_probadegrade_0.5_nbiter_150000_nbpatterns_3_prestimetest_3_patternsize_1024_prestime_20'\n#suffix=\"eta_nbpatterns_3_inputboost_1.0_nbprescycles_3_prestime_20_prestimetest_5_interpresdelay_2_patternsize_1024_nbiter_50000_probadegrade_0.5_learningrate_0.0001_eta_0.01_rngseed_0\"\n\nsuffix='etarefiner_eta_0.01_nbpatterns_3_interpresdelay_2_patternsize_1024_prestime_20_learningrate_1e-05_nbprescycles_3_rngseed_0_prestimetest_3_probadegrade_0.5_inputboost_1.0_nbiter_150000'\n\n\n#fn = './tmp/results_'+suffix+'.dat'\nfn = './results_'+suffix+'.dat'\nwith open(fn, 'rb') as fo:\n    myw = pickle.load(fo)\n    myalpha = pickle.load(fo)\n    myeta = pickle.load(fo)\n    myall_losses = pickle.load(fo)\n    myparams = pickle.load(fo)\n\nnet = Network(myparams)\n\n#np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n#rngseed=4\nrngseed=7\nnp.random.seed(rngseed); random.seed(rngseed); torch.manual_seed(rngseed)\n\n#print myall_losses\n\nttype = torch.cuda.FloatTensor # Must match the one in pics_eta.py\n#ttype = torch.FloatTensor # Must match the one in pics_eta.py\n\nnet.w.data = torch.from_numpy(myw).type(ttype)\nnet.alpha.data = torch.from_numpy(myalpha).type(ttype)\nnet.eta.data = torch.from_numpy(myeta).type(ttype)\nprint(net.w.data[:10,:10])\nprint(net.eta.data)\n\nNBPICS = 10\nnn=1\n\nimagesize = int(np.sqrt(myparams['patternsize']))\noutputs={}\nplt.figure()\nFILLINGSTEPS = myparams['prestimetest'] + myparams['interpresdelay'] + 1\n\nfor numpic in range(NBPICS):\n\n    print(\"Pattern\", numpic)\n\n    z = np.random.rand()\n    z = np.random.rand()\n\n    inputsTensor, targetPattern = pics.generateInputsAndTarget(myparams, contiguousperturbation=True)\n\n    y = net.initialZeroState()\n    hebb = net.initialZeroHebb()\n    net.zeroDiagAlpha()\n\n    for numstep in range(myparams['nbsteps']):\n        y, hebb = 
net(Variable(inputsTensor[numstep], requires_grad=False), y, hebb)\n        if numstep >= myparams['nbsteps'] - FILLINGSTEPS:\n            output = y.data.cpu().numpy()[0][:-1].reshape((imagesize, imagesize))\n            #output = scipy.misc.imresize(output, 4.0)\n            plt.subplot(NBPICS, FILLINGSTEPS, nn)\n            plt.axis('off')\n            plt.imshow(output, cmap='gray', vmin=-1.0, vmax=1.0)\n            nn += 1\n            #scipy.misc.imsave('pic'+str(numpic)+'_'+str(numstep)+'.png', output)\n\n\nplt.show(block=True)\n   \n    \n    # All images could be  rotated 90deg. This allows us to display each set as a\n    # vertical column by rotating the final image 90 degrees too.\n\n    #output = y.data.cpu().numpy()[0][:-1].reshape((imagesize, imagesize))\n    #pattern1 = inputsTensor.cpu().numpy()[0][0][:-1].reshape((imagesize, imagesize))\n    #pattern2 = inputsTensor.cpu().numpy()[myparams['prestime']+myparams['interpresdelay']+1][0][:-1].reshape((imagesize, imagesize))\n    #pattern3 = inputsTensor.cpu().numpy()[2*(myparams['prestime']+myparams['interpresdelay'])+1][0][:-1].reshape((imagesize, imagesize))\n    #blankedpattern = inputsTensor.cpu().numpy()[-1][0][:-1].reshape((imagesize, imagesize))\n\n    #plt.subplot(NBPICS,5,nn)\n    #plt.axis('off')\n    #plt.imshow(pattern1, cmap='gray', vmin=-1.0, vmax=1.0)\n    #plt.subplot(NBPICS,5,nn+1)\n    #plt.axis('off')\n    #plt.imshow(pattern2, cmap='gray', vmin=-1.0, vmax=1.0)\n    #plt.subplot(NBPICS,5,nn+2)\n    #plt.axis('off')\n    #plt.imshow(pattern3, cmap='gray', vmin=-1.0, vmax=1.0)\n    #plt.subplot(NBPICS,5,nn+3)\n    #plt.axis('off')\n    #plt.imshow(blankedpattern, cmap='gray', vmin=-1.0, vmax=1.0)\n    #plt.subplot(NBPICS,5,nn+4)\n    #plt.imshow(output, cmap='gray', vmin=-1.0, vmax=1.0)\n    #plt.axis('off')\n    #nn += 5\n\n    #td = targetPattern.cpu().numpy()\n    #yd = y.data.cpu().numpy()[0][:-1]\n    #absdiff = np.abs(td-yd)\n    #print(\"Mean / median / max abs diff:\", 
np.mean(absdiff), np.median(absdiff), np.max(absdiff))\n    #print(\"Correlation (full / sign): \", np.corrcoef(td, yd)[0][1], np.corrcoef(np.sign(td), np.sign(yd))[0][1])\n    ##print inputs[numstep]\n#plt.subplots_adjust(wspace=.1, hspace=.1)\n"
  },
  {
    "path": "images/testpics.py",
    "content": "# Generate a figure that shows a number of episodes\n#\n# Copyright (c) 2018 Uber Technologies, Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#    http://www.apache.org/licenses/LICENSE-2.0\n#\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n\n\n\n\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nfrom torch import optim\nimport random\nimport sys\nimport pickle\nimport pdb\nimport time\n\nnp.set_printoptions(precision=3)\nimport matplotlib.pyplot as plt\nplt.ion()\n\n\nimport images as pics\nfrom images import Network\n\nplt.figure()\n\n# Note that this is a different file from the ones used in training\nwith open('./data_batch_5', 'rb') as fo:\n    imagedict = pickle.load(fo, encoding='bytes')\nimagedata = imagedict[b'data']\n\nsuffix='images_patternsize_1024_interpresdelay_2_nbpatterns_3_lr_0.0001_nbprescycles_3_homogenous_20_nbiter_100000_prestime_20_probadegrade_0.5_prestimetest_3_rngseed_0'\n#fn = './tmp/results_'+suffix+'.dat'\nfn = './results_'+suffix+'.dat'\nwith open(fn, 'rb') as fo:\n    myw = pickle.load(fo)\n    myalpha = pickle.load(fo)\n    myeta = pickle.load(fo)\n    myall_losses = pickle.load(fo)\n    myparams = pickle.load(fo)\n\nnet = Network(myparams)\n\n#np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n#rngseed=4\nrngseed=4\nnp.random.seed(rngseed); random.seed(rngseed); torch.manual_seed(rngseed)\n\n#print myall_losses\n\nttype = 
torch.cuda.FloatTensor # Must match the one in pics_eta.py\n#ttype = torch.FloatTensor # Must match the one in pics_eta.py\n\nnet.w.data = torch.from_numpy(myw).type(ttype)\nnet.alpha.data = torch.from_numpy(myalpha).type(ttype)\nnet.eta.data = torch.from_numpy(myeta).type(ttype)\nprint(net.w.data[:10,:10])\nprint(net.eta.data)\n\nNBPICS = 7\nnn=1\nfor numpic in range(NBPICS):\n\n    print(\"Pattern\", numpic)\n\n    inputsTensor, targetPattern = pics.generateInputsAndTarget(myparams, contiguousperturbation=True)\n\n    y = net.initialZeroState()\n    hebb = net.initialZeroHebb()\n    #net.zeroDiagAlpha()\n\n    for numstep in range(myparams['nbsteps']):\n        y, hebb = net(Variable(inputsTensor[numstep], requires_grad=False), y, hebb)\n\n   \n    \n    # All images could be  rotated 90deg. This allows us to display each set as a\n    # vertical column by rotating the final image 90 degrees too.\n\n    imagesize = int(np.sqrt(myparams['patternsize']))\n    output = y.data.cpu().numpy()[0][:-1].reshape((imagesize, imagesize))\n    pattern1 = inputsTensor.cpu().numpy()[0][0][:-1].reshape((imagesize, imagesize))\n    pattern2 = inputsTensor.cpu().numpy()[myparams['prestime']+myparams['interpresdelay']+1][0][:-1].reshape((imagesize, imagesize))\n    pattern3 = inputsTensor.cpu().numpy()[2*(myparams['prestime']+myparams['interpresdelay'])+1][0][:-1].reshape((imagesize, imagesize))\n    blankedpattern = inputsTensor.cpu().numpy()[-1][0][:-1].reshape((imagesize, imagesize))\n    #output = y.data.cpu().numpy()[0][:-1].reshape((imagesize, imagesize)).T\n    #pattern1 = inputsTensor.cpu().numpy()[0][0][:-1].reshape((imagesize, imagesize)).T\n    #pattern2 = inputsTensor.cpu().numpy()[myparams['prestime']+myparams['interpresdelay']+1][0][:-1].reshape((imagesize, imagesize)).T\n    #pattern3 = inputsTensor.cpu().numpy()[2*(myparams['prestime']+myparams['interpresdelay'])+1][0][:-1].reshape((imagesize, imagesize)).T\n    #blankedpattern = 
inputsTensor.cpu().numpy()[-1][0][:-1].reshape((imagesize, imagesize)).T\n\n    plt.subplot(NBPICS,5,nn)\n    plt.axis('off')\n    plt.imshow(pattern1, cmap='gray', vmin=-1.0, vmax=1.0)\n    plt.subplot(NBPICS,5,nn+1)\n    plt.axis('off')\n    plt.imshow(pattern2, cmap='gray', vmin=-1.0, vmax=1.0)\n    plt.subplot(NBPICS,5,nn+2)\n    plt.axis('off')\n    plt.imshow(pattern3, cmap='gray', vmin=-1.0, vmax=1.0)\n    plt.subplot(NBPICS,5,nn+3)\n    plt.axis('off')\n    plt.imshow(blankedpattern, cmap='gray', vmin=-1.0, vmax=1.0)\n    plt.subplot(NBPICS,5,nn+4)\n    plt.imshow(output, cmap='gray', vmin=-1.0, vmax=1.0)\n    plt.axis('off')\n    nn += 5\n\n    td = targetPattern.cpu().numpy()\n    yd = y.data.cpu().numpy()[0][:-1]\n    absdiff = np.abs(td-yd)\n    print(\"Mean / median / max abs diff:\", np.mean(absdiff), np.median(absdiff), np.max(absdiff))\n    print(\"Correlation (full / sign): \", np.corrcoef(td, yd)[0][1], np.corrcoef(np.sign(td), np.sign(yd))[0][1])\n    #print inputs[numstep]\nplt.subplots_adjust(wspace=.1, hspace=.1)\n"
  },
  {
    "path": "maze/OpusHdfsCopy.py",
    "content": "# Uber-only code for interacting with hdfs\n#\n# Copyright (c) 2018 Uber Technologies, Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#    http://www.apache.org/licenses/LICENSE-2.0\n#\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n\nimport os\nimport os.path\n\ndef checkHdfs():\n    return os.path.isfile('/opt/hadoop/latest/bin/hdfs')\n\ndef transferFileToHdfsPath(sourcepath, targetpath):\n    hdfspath = targetpath\n    targetdir = os.path.dirname(targetpath)\n    os.system('/opt/hadoop/latest/bin/hdfs dfs -mkdir -p {}'.format(targetdir))\n    result = os.system(\n        '/opt/hadoop/latest/bin/hdfs dfs -copyFromLocal -f {} {}'.format(sourcepath, hdfspath)\n    )\n    if result != 0:\n        raise OSError('Cannot copyFromLocal {} {} returned {}'.format(sourcepath, hdfspath, result))\n\ndef transferFileToHdfsDir(sourcepath, targetdir):\n    hdfspath = os.path.join(targetdir, os.path.basename(sourcepath))\n    os.system('/opt/hadoop/latest/bin/hdfs dfs -mkdir -p {}'.format(targetdir))\n    result = os.system(\n        '/opt/hadoop/latest/bin/hdfs dfs -copyFromLocal -f {} {}'.format(sourcepath, hdfspath)\n    )\n    if result != 0:\n        raise OSError('Cannot copyFromLocal {} {} returned {}'.format(sourcepath, hdfspath, result))\n\n"
  },
  {
    "path": "maze/README.md",
    "content": "# Grid Maze task\n\nThis code performs the grid-maze task: the agent must hit the (invisible) reward location as many times as\npossible within a fixed number of time steps. Because the reward location is\nrandomized at the start of each episode, and the agent is randomly teleported\nevery time it hits the reward, the agent must discover and memorize the reward\nlocation anew in each episode.\n\nThe agent's only inputs consist of the 3x3 neighborhood around the agent's\nlocation, as well as the reward obtained (if any) and the action chosen at the\nprevious time step.\n\nThe outer-loop meta-learning algorithm is Advantage Actor-Critic. All\nwithin-episode learning occurs through the self-modulated plasticity of network\nconnections.\n\nFor a simpler (but less flexible) implementation of the same task, see the `simplemaze` directory in this repo.\n\n## Visualizations of agent behavior\n\nWe show the behavior of the agent over two successive episodes, after 0 and 200,000 meta-learning iterations. The reward location is indicated only for visualization purposes: it is invisible to the agent.\n\n### Episode 0\n\n![Episode 0](anim0_maze.gif)\n\n### Episode 200,000\n\n![Episode 200,000](anim200K_maze.gif)\n\n\n## Usage\n\n`python3 batch.py  --eplen 200 --hs 100  --lr 1e-4 --l2 0 --addpw 3 --pe 1000 --blossv 0.1 --bent 0.03 --rew 10 --save_every 1000 --rsp 1 --type modplast --da tanh  --nbiter 200002 --msize 13  --wp 0.0 --bs 30 --gc 4.0 --rngseed 0`\n\n`eplen` is the length of an episode, `hs` is the hidden/recurrent layer size, `bs` is the batch size and `gc` is the gradient-clipping threshold.\n`type` can be \"modplast\" (simple neuromodulation), \"modul\" (retroactive modulation), \"plastic\" (non-modulated plasticity) or \"rnn\" (no plasticity at all, a plain RNN).\n"
  },
  {
    "path": "maze/anim.py",
    "content": "# python anim.py  --nbiter 1000000 --rule oja --squash 0 --hiddensize 200  --lr 1e-4 --eplen 250 --print_every 200 --save_every 1000  --bentropy 0.1 --blossv .03 --randstart 1 --gr .9 --rp 0 --labsize 11 --rngseed 1 --type plastic\n\n\nimport argparse\nimport pdb\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nfrom torch import optim\nfrom torch.optim import lr_scheduler\nimport random\nimport sys\nimport pickle\nimport time\nimport os\nimport OpusHdfsCopy\nfrom OpusHdfsCopy import transferFileToHdfsDir, checkHdfs\nimport platform\n\nimport gridlab\nfrom gridlab import Network\n\nimport matplotlib.pyplot as plt\nimport matplotlib.animation as animation\nimport glob\n\n\n\n\n\nnp.set_printoptions(precision=4)\n\nETA = .02  # Not used\n\nADDINPUT = 4 # 1 input for the previous reward, 1 input for numstep, 1 for whether currently on reward square, 1 \"Bias\" input\n\nNBACTIONS = 4  # U, D, L, R\n\nRFSIZE = 3 # Receptive Field\n\nTOTALNBINPUTS =  RFSIZE * RFSIZE + ADDINPUT + NBACTIONS\n\n\nfig = plt.figure()\nplt.axis('off')\n\ndef train(paramdict):\n    #params = dict(click.get_current_context().params)\n    print(\"Starting training...\")\n    params = {}\n    #params.update(defaultParams)\n    params.update(paramdict)\n    print(\"Passed params: \", params)\n    print(platform.uname())\n    #params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['interpresdelay']) * params['nbclasses']) + params['prestimetest']  # Total number of steps per episode\n\n\n    # This needs to be the same as in the file generated by gridlab, and thus the command-line parameters must be identical\n    suffix = \"grid_\"+\"\".join([str(x)+\"_\" if pair[0] != 'nbsteps' and pair[0] != 'rngseed' and pair[0] != 'save_every' and pair[0] != 'test_every' else '' for pair in sorted(zip(params.keys(), params.values()), 
key=lambda x:x[0] ) for x in pair])[:-1] + \"_rngseed_\" + str(params['rngseed'])   # Turning the parameters into a nice suffix for filenames\n\n\n    params['rngseed'] = 3\n    # Initialize random seeds (first two redundant?)\n    print(\"Setting random seeds\")\n    np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n    #print(click.get_current_context().params)\n    \n    net = Network(params)\n    net.load_state_dict(torch.load('./tmpWorked/torchmodel_'+suffix + '.txt'))\n\n\n    print (\"Shape of all optimized parameters:\", [x.size() for x in net.parameters()])\n    allsizes = [torch.numel(x.data.cpu()) for x in net.parameters()]\n    print (\"Size (numel) of all optimized elements:\", allsizes)\n    print (\"Total size (numel) of all optimized elements:\", sum(allsizes))\n\n    #total_loss = 0.0\n    print(\"Initializing optimizer\")\n    optimizer = torch.optim.Adam(net.parameters(), lr=1.0*params['lr'], eps=1e-4)\n    #optimizer = torch.optim.SGD(net.parameters(), lr=1.0*params['lr'])\n    #scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, params['gamma']) \n    #scheduler = torch.optim.lr_scheduler.StepLR(optimizer, gamma=params['gamma'], step_size=params['steplr'])\n\n    LABSIZE = params['labsize'] \n    lab = np.ones((LABSIZE, LABSIZE))\n    CTR = LABSIZE // 2 \n\n    # Simple cross maze\n    #lab[CTR, 1:LABSIZE-1] = 0\n    #lab[1:LABSIZE-1, CTR] = 0\n\n\n    # Double-T maze\n    #lab[CTR, 1:LABSIZE-1] = 0\n    #lab[1:LABSIZE-1, 1] = 0\n    #lab[1:LABSIZE-1, LABSIZE - 2] = 0\n\n    # Grid maze\n    lab[1:LABSIZE-1, 1:LABSIZE-1].fill(0)\n    for row in range(1, LABSIZE - 1):\n        for col in range(1, LABSIZE - 1):\n            if row % 2 == 0 and col % 2 == 0:\n                lab[row, col] = 1\n    lab[CTR,CTR] = 0 # Not strictly necessary, but perhaps helps localization by introducing a detectable irregularity in the center\n\n\n\n    all_losses = []\n    all_losses_objective = []\n   
 all_losses_eval = []\n    all_losses_v = []\n    lossbetweensaves = 0\n    nowtime = time.time()\n    \n    print(\"Starting episodes...\")\n    sys.stdout.flush()\n\n    pos = 0\n    hidden = net.initialZeroState()\n    hebb = net.initialZeroHebb()\n\n\n    # Starting episodes!\n    \n    params['nbiter'] = 1\n    \n    for numiter in range(params['nbiter']):\n        \n        PRINTTRACE = 0\n        if (numiter+1) % (1 + params['print_every']) == 0:\n            PRINTTRACE = 1\n\n        ## Where is the reward square for this episode?\n        \n        #rnd = np.random.randint(0,4) \n        ##if rnd == 0:\n        ##    rposr = 1; rposc = CTR\n        ##elif rnd == 1:\n        ##    rposr = CTR; rposc = 1\n        ##elif rnd == 2:\n        ##    rposr = CTR; rposc = LABSIZE - 2\n        ##elif rnd == 3:\n        ##    rposr = LABSIZE - 2; rposc = CTR\n        #if rnd == 0:\n        #    rposr = 1; rposc = 1 \n        #elif rnd == 1:\n        #    rposr = LABSIZE - 2; rposc = 1\n        #elif rnd == 2:\n        #    rposr = 1; rposc = LABSIZE - 2\n        #elif rnd == 3:\n        #    rposr = LABSIZE - 2; rposc = LABSIZE - 2\n\n        # Note: it doesn't matter if the reward is on the center (see below). 
All we need is not to put it on a wall or pillar (lab=1)\n        rposr = 0; rposc = 0\n        if params['rp'] == 0:\n            while lab[rposr, rposc] == 1:\n                rposr = np.random.randint(1, LABSIZE - 1)\n                rposc = np.random.randint(1, LABSIZE - 1)\n        elif params['rp'] == 1:\n            while lab[rposr, rposc] == 1 or (rposr != 1 and rposr != LABSIZE -2 and rposc != 1 and rposc != LABSIZE-2):\n                rposr = np.random.randint(1, LABSIZE - 1)\n                rposc = np.random.randint(1, LABSIZE - 1)\n        #print(\"Reward pos:\", rposr, rposc)\n\n        # Agent always starts an episode from the center\n        posc = CTR\n        posr = CTR\n\n        optimizer.zero_grad()\n        loss = 0\n        lossv = 0\n        hidden = net.initialZeroState()\n        hebb = net.initialZeroHebb()\n\n\n        reward = 0.0\n        rewards = []\n        vs = []\n        logprobs = []\n        sumreward = 0.0\n        dist = 0\n\n\n        #params['print_every'] = 10\n\n\n\n        print(\"==========\")\n        print(\"==========\")\n\n        ax_imgs = []\n\n        for numstep in range(params['eplen']):\n            \n            \n            inputsN = np.zeros((1, TOTALNBINPUTS), dtype='float32')\n            inputsN[0, 0:RFSIZE * RFSIZE] = lab[posr - RFSIZE//2:posr + RFSIZE//2 +1, posc - RFSIZE //2:posc + RFSIZE//2 +1].flatten()\n            \n            inputs = torch.from_numpy(inputsN).cuda()\n            # Previous chosen action\n            #inputs[0][numactionchosen] = 1\n            inputs[0][-1] = 1 # Bias neuron\n            inputs[0][-2] = numstep\n            inputs[0][-3] = reward\n            #if rposr == posr and rposc = posc:\n            #    inputs[0][-4] = 1\n            #else:\n            #    inputs[0][-4] = 0\n            \n            # Running the network\n            y, v, hidden, hebb = net(Variable(inputs, requires_grad=False), hidden, hebb)  # y  should output probabilities\n        \n          
  distrib = torch.distributions.Categorical(y)\n            actionchosen = distrib.sample()  # sample() returns a Pytorch tensor of size 1; this is needed for the backprop below\n            numactionchosen = actionchosen.data[0]    # Turn to scalar\n\n            tgtposc = posc\n            tgtposr = posr\n            if numactionchosen == 0:  # Up\n                tgtposr -= 1\n            elif numactionchosen == 1:  # Down\n                tgtposr += 1\n            elif numactionchosen == 2:  # Left\n                tgtposc -= 1\n            elif numactionchosen == 3:  # Right\n                tgtposc += 1\n            else:\n                raise ValueError(\"Wrong Action\")\n            \n            reward = 0.0\n            if lab[tgtposr][tgtposc] == 1:\n                reward = -.1\n            else:\n                dist += 1\n                posc = tgtposc\n                posr = tgtposr\n\n            # Display the labyrinth\n\n            for numr in range(LABSIZE):\n                s = \"\"\n                for numc in range(LABSIZE):\n                    if posr == numr and posc == numc:\n                        s += \"o\"\n                    elif rposr == numr and rposc == numc:\n                        s += \"X\"\n                    elif lab[numr, numc] == 1:\n                        s += \"#\"\n                    else:\n                        s += \" \"\n                print(s)\n            print(\"\")\n            print(\"\")\n\n            labg = lab.copy()\n            labg[rposr, rposc] = 2\n            labg[posr, posc] = 3\n            fullimg = plt.imshow(labg, animated=True)\n            ax_imgs.append([fullimg])  \n\n\n\n\n            # Did we hit the reward location ? 
Increase reward and teleport!\n            # Note that it doesn't matter if we teleport onto the reward, since reward hitting is only evaluated after the (obligatory) move\n            if rposr == posr and rposc == posc:\n                reward += 10\n                if params['randstart'] == 1:\n                    posr = np.random.randint(1, LABSIZE - 1)\n                    posc = np.random.randint(1, LABSIZE - 1)\n                    while lab[posr, posc] == 1:\n                        posr = np.random.randint(1, LABSIZE - 1)\n                        posc = np.random.randint(1, LABSIZE - 1)\n                else:\n                    posr = CTR\n                    posc = CTR\n\n\n\n            rewards.append(reward)\n            vs.append(v)\n            sumreward += reward\n            \n            #loss -= distrib.log_prob(actionchosen)  # * reward\n            logprobs.append(distrib.log_prob(actionchosen))\n\n            loss += params['bentropy'] * y.pow(2).sum()   # We want to penalize concentration, i.e. encourage diversity; our version of PyTorch does not have an entropy() function for Distribution. Note: .2 may be too strong, .04 may be too weak. \n\n            #if PRINTTRACE:\n            #    print(\"Probabilities:\", y.data.cpu().numpy(), \"Picked action:\", numactionchosen, \", got reward\", reward)\n\n        R = 0\n        gammaR = params['gr']\n        for numstepb in reversed(range(params['eplen'])) :\n            R = gammaR * R + rewards[numstepb]\n            lossv += (vs[numstepb][0] - R).pow(2) \n            loss -= logprobs[numstepb] * (R - vs[numstepb].data[0][0])  # Not sure if the \"data\" is needed... 
put it b/c of worry about weird gradient flows\n\n\n\n        if True: #PRINTTRACE:\n            print(\"lossv: \", lossv.data.cpu().numpy()[0])\n            print (\"Total reward for this episode:\", sumreward, \"Dist:\", dist)\n\n        if params['squash'] == 1:\n            if sumreward < 0:\n                sumreward = -np.sqrt(-sumreward)\n            else:\n                sumreward = np.sqrt(sumreward)\n        elif params['squash'] == 0:\n            pass\n        else:\n            raise ValueError(\"Incorrect value for squash parameter\")\n\n        #loss *= sumreward\n        loss += params['blossv'] * lossv\n        loss /= params['eplen']\n        \n        #loss.backward()\n\n        ##for p in net.parameters():\n        ##    p.grad.data.clamp_(-params['clamp'], params['clamp'])\n        #scheduler.step()\n        #optimizer.step()\n\n        #torch.cuda.empty_cache()  \n\n        lossnum = loss.data[0]\n        lossbetweensaves += lossnum\n        if (numiter + 1) % 10 == 0:\n            all_losses_objective.append(lossnum)\n            all_losses_eval.append(sumreward)\n            all_losses_v.append(lossv.data[0])\n        #total_loss  += lossnuma\n\n        anim = animation.ArtistAnimation(fig, ax_imgs, interval=200)\n        anim.save('anim.gif', writer='imagemagick', fps=10)\n\n\n        if (numiter+1) % params['print_every'] == 0:\n\n            print(numiter, \"====\")\n            print(\"Mean loss: \", lossbetweensaves / params['print_every'])\n            lossbetweensaves = 0\n            previoustime = nowtime\n            nowtime = time.time()\n            print(\"Time spent on last\", params['print_every'], \"iters: \", nowtime - previoustime)\n            if params['type'] == 'plastic' or params['type'] == 'lstmplastic':\n                print(\"ETA: \", net.eta.data.cpu().numpy(), \"alpha[0,1]: \", net.alpha.data.cpu().numpy()[0,1], \"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n            elif params['type'] == 'rnn':\n            
    print(\"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n\n        if (numiter+1) % params['save_every'] == 0:\n            print(\"Saving files...\")\n#            lossbetweensaves /= params['save_every']\n#            print(\"Average loss over the last\", params['save_every'], \"episodes:\", lossbetweensaves)\n#            print(\"Alternative computation (should be equal):\", np.mean(all_losses_objective[-params['save_every']:]))\n            losslast100 = np.mean(all_losses_objective[-100:])\n            print(\"Average loss over the last 100 episodes:\", losslast100)\n#            # Instability detection; necessary for SELUs, which seem to be divergence-prone\n#            # Note that if we are unlucky enough to have diverged within the last 100 timesteps, this may not save us.\n#            if losslast100 > 2 * lossbetweensavesprev: \n#                print(\"We have diverged ! Restoring last savepoint!\")\n#                net.load_state_dict(torch.load('./torchmodel_'+suffix + '.txt'))\n#            else:\n            print(\"Saving local files...\")\n#            with open('results_'+suffix+'.dat', 'wb') as fo:\n#                    pickle.dump(net.w.data.cpu().numpy(), fo)\n#                    pickle.dump(net.alpha.data.cpu().numpy(), fo)\n#                    pickle.dump(net.eta.data.cpu().numpy(), fo)\n#                    pickle.dump(all_losses, fo)\n#                    pickle.dump(params, fo)\n            #with open('loss_'+suffix+'.txt', 'w') as thefile:\n            #    for item in all_losses_objective:\n            #            thefile.write(\"%s\\n\" % item)\n            #with open('lossv_'+suffix+'.txt', 'w') as thefile:\n            #    for item in all_losses_v:\n            #            thefile.write(\"%s\\n\" % item)\n            #with open('loss_'+suffix+'.txt', 'w') as thefile:\n            #    for item in all_losses_eval:\n            #            thefile.write(\"%s\\n\" % item)\n            #torch.save(net.state_dict(), 
'torchmodel_'+suffix+'.txt')\n            #print(\"Saving HDFS files...\")\n            #if checkHdfs():\n            #    print(\"Transfering to HDFS...\")\n            #    #transferFileToHdfsDir('results_'+suffix+'.dat', '/ailabs/tmiconi/omniglot/')\n            #    transferFileToHdfsDir('loss_'+suffix+'.txt', '/ailabs/tmiconi/gridlab/')\n            #    transferFileToHdfsDir('torchmodel_'+suffix+'.txt', '/ailabs/tmiconi/omniglot/')\n            #print(\"Saved!\")\n#            lossbetweensavesprev = lossbetweensaves\n#            lossbetweensaves = 0\n#            sys.stdout.flush()\n#            sys.stderr.flush()\n\n\n\nif __name__ == \"__main__\":\n#defaultParams = {\n#    'type' : 'lstm',\n#    'seqlen' : 200,\n#    'hiddensize': 500,\n#    'activ': 'tanh',\n#    'steplr': 10e9,  # By default, no change in the learning rate\n#    'gamma': .5,  # The annealing factor of learning rate decay for Adam\n#    'imagesize': 31,    \n#    'nbiter': 30000,  \n#    'lr': 1e-4,   \n#    'test_every': 10,\n#    'save_every': 3000,\n#    'rngseed':0\n#}\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--rngseed\", type=int, help=\"random seed\", default=0)\n    #parser.add_argument(\"--clamp\", type=float, help=\"maximum (absolute value) gradient for clamping\", default=1000000.0)\n    parser.add_argument(\"--bentropy\", type=float, help=\"coefficient for the entropy reward (really Simpson index concentration measure)\", default=0.1)\n    parser.add_argument(\"--blossv\", type=float, help=\"coefficient for value prediction loss\", default=.1)\n    parser.add_argument(\"--labsize\", type=int, help=\"size of the labyrinth; must be odd\", default=7)\n    parser.add_argument(\"--randstart\", type=int, help=\"when hitting reward, should we teleport to random location (1) or center (0)?\", default=0)\n    parser.add_argument(\"--rp\", type=int, help=\"whether the reward should be on the periphery\", default=0)\n    parser.add_argument(\"--squash\", 
type=int, help=\"squash reward through signed sqrt (1 or 0)\", default=0)\n    #parser.add_argument(\"--nbarms\", type=int, help=\"number of arms\", default=2)\n    #parser.add_argument(\"--nbseq\", type=int, help=\"number of sequences between reinitializations of hidden/Hebbian state and position\", default=3)\n    parser.add_argument(\"--activ\", help=\"activation function ('tanh' or 'selu')\", default='tanh')\n    parser.add_argument(\"--rule\", help=\"learning rule ('hebb' or 'oja')\", default='hebb')\n    parser.add_argument(\"--type\", help=\"network type ('lstm' or 'rnn' or 'plastic')\", default='rnn')\n    parser.add_argument(\"--gr\", type=float, help=\"gammaR: discount factor for rewards\", default=.99)\n    parser.add_argument(\"--lr\", type=float, help=\"learning rate (Adam optimizer)\", default=1e-4)\n    parser.add_argument(\"--eplen\", type=int, help=\"length of episodes\", default=100)\n    parser.add_argument(\"--hiddensize\", type=int, help=\"size of the recurrent (hidden) layer\", default=100)\n    #parser.add_argument(\"--steplr\", type=int, help=\"duration of each step in the learning rate annealing schedule\", default=100000000)\n    parser.add_argument(\"--steplr\", type=int, help=\"duration of each step in the learning rate annealing schedule\", default=0)\n    parser.add_argument(\"--gamma\", type=float, help=\"learning rate annealing factor\", default=0.3)\n    parser.add_argument(\"--nbiter\", type=int, help=\"number of learning cycles\", default=1000000)\n    parser.add_argument(\"--save_every\", type=int, help=\"number of cycles between successive save points\", default=200)\n    parser.add_argument(\"--print_every\", type=int, help=\"number of cycles between successive printing of information\", default=100)\n    #parser.add_argument(\"--\", type=int, help=\"\", default=1e-4)\n    args = parser.parse_args(); argvars = vars(args); argdict = {k: argvars[k] for k in argvars if argvars[k] is not None}\n    #train()\n    train(argdict)\n\n"
  },
  {
    "path": "maze/animbatch.py",
    "content": "# This code produces animations showing the behavior of an agent for two successive episodes.\n\n# Usage: python animbatch.py --file FILENAME  [--initialize [0/1]]\n\n# FILENAME should be the params_XXX.dat produced by the meta-learning process\n# (batch.py). Make sure that the torchmodel_xxx.dat file is in the same location.\n\n# Optional argument initialize should be set to 1 if you want to ignore the\n# trained parameters and reinitialize the network, equivalent to obtaining the\n# \"generation-0\" network. \n\n\nimport argparse\nimport pdb \nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nfrom torch import optim\nfrom torch.optim import lr_scheduler\nimport random\nimport sys\nimport pickle\nimport time\nimport os\nimport OpusHdfsCopy\nfrom OpusHdfsCopy import transferFileToHdfsDir, checkHdfs\nimport platform\n\nimport batch\nfrom batch import Network\n\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport matplotlib.animation as animation\nimport glob\n\n\n\n\n\nnp.set_printoptions(precision=4)\n\nETA = .02  # Not used\n\nADDINPUT = 4 # 1 input for the previous reward, 1 input for numstep, 1 for whether currently on reward square, 1 \"Bias\" input\n\nNBACTIONS = 4  # U, D, L, R\n\nRFSIZE = 3 # Receptive Field\n\nTOTALNBINPUTS =  RFSIZE * RFSIZE + ADDINPUT + NBACTIONS\n\n\nfig = plt.figure()\nplt.axis('off')\n\ndef train(paramdict):\n\n    fname = paramdict['file']\n\n    with open(fname, 'rb') as f:\n        params = pickle.load(f)\n\n    #params = dict(click.get_current_context().params)\n    print(\"Passed params: \", params)\n    print(platform.uname())\n    #params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['interpresdelay']) * params['nbclasses']) + params['prestimetest']  # Total number of steps per episode\n\n    suffix = \"btchFixmod_\"+\"\".join([str(x)+\"_\" if pair[0] != 'nbsteps' and pair[0] != 
'rngseed' and pair[0] != 'save_every' and pair[0] != 'test_every' and pair[0] != 'pe' else '' for pair in sorted(zip(params.keys(), params.values()), key=lambda x:x[0] ) for x in pair])[:-1] + \"_rngseed_\" + str(params['rngseed'])   # Turning the parameters into a nice suffix for filenames\n    #suffix = \"modRPDT_\"+\"\".join([str(x)+\"_\" if pair[0] != 'nbsteps' and pair[0] != 'rngseed' and pair[0] != 'save_every' and pair[0] != 'test_every' else '' for pair in sorted(zip(params.keys(), params.values()), key=lambda x:x[0] ) for x in pair])[:-1] + \"_rngseed_\" + str(params['rngseed'])   # Turning the parameters into a nice suffix for filenames\n    print(\"Reconstructed suffix:\", suffix)\n\n\n    params['rsp'] = 1\n\n    #params['rngseed'] = 3\n    # Initialize random seeds (first two redundant?)\n    print(\"Setting random seeds\")\n    np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n    #print(click.get_current_context().params)\n    \n    net = Network(params)\n    # YOU MAY NEED TO CHANGE THE DIRECTORY HERE:\n    if paramdict['initialize'] == 0:\n        net.load_state_dict(torch.load('./tmp/torchmodel_'+suffix + '.dat'))\n\n\n    print (\"Shape of all optimized parameters:\", [x.size() for x in net.parameters()])\n    allsizes = [torch.numel(x.data.cpu()) for x in net.parameters()]\n    print (\"Size (numel) of all optimized elements:\", allsizes)\n    print (\"Total size (numel) of all optimized elements:\", sum(allsizes))\n\n    BATCHSIZE = params['bs']\n\n    LABSIZE = params['msize'] \n    lab = np.ones((LABSIZE, LABSIZE))\n    CTR = LABSIZE // 2 \n\n    # Simple cross maze\n    #lab[CTR, 1:LABSIZE-1] = 0\n    #lab[1:LABSIZE-1, CTR] = 0\n\n\n    # Double-T maze\n    #lab[CTR, 1:LABSIZE-1] = 0\n    #lab[1:LABSIZE-1, 1] = 0\n    #lab[1:LABSIZE-1, LABSIZE - 2] = 0\n\n    # Grid maze\n    lab[1:LABSIZE-1, 1:LABSIZE-1].fill(0)\n    for row in range(1, LABSIZE - 1):\n        for col in range(1, LABSIZE 
- 1):\n            if row % 2 == 0 and col % 2 == 0:\n                lab[row, col] = 1\n    # Not strictly necessary, but cleaner since we start the agent at the\n    # center for each episode; may help localization in some maze sizes\n    # (including 13 and 9, but not 11) by introducing a detectable irregularity\n    # in the center:\n    lab[CTR,CTR] = 0 \n\n\n\n    all_losses = []\n    all_grad_norms = []\n    all_losses_objective = []\n    all_total_rewards = []\n    all_losses_v = []\n    lossbetweensaves = 0\n    nowtime = time.time()\n    meanrewards = np.zeros((LABSIZE, LABSIZE))\n    meanrewardstmp = np.zeros((LABSIZE, LABSIZE, params['eplen']))\n\n\n    pos = 0\n    hidden = net.initialZeroState()\n    hebb = net.initialZeroHebb()\n    pw = net.initialZeroPlasticWeights()\n\n    #celoss = torch.nn.CrossEntropyLoss() # For supervised learning - not used here\n\n\n\n\n    \n    params['nbiter'] = 3\n    ax_imgs = []\n    \n    for numiter in range(params['nbiter']):\n\n        PRINTTRACE = 0\n        #if (numiter+1) % (1 + params['pe']) == 0:\n        if (numiter+1) % (params['pe']) == 0:\n            PRINTTRACE = 1\n\n        #lab = makemaze.genmaze(size=LABSIZE, nblines=4)\n        #count = np.zeros((LABSIZE, LABSIZE))\n\n        # Select the reward location for this episode - not on a wall!\n        # And not on the center either! (though not sure how useful that restriction is...)\n        # We always start the episode from the center (when hitting reward, we may teleport either to center or to a random location depending on params['rsp'])\n        posr = {}; posc = {}\n        rposr = {}; rposc = {}\n        for nb in range(BATCHSIZE):\n            # Note: it doesn't matter if the reward is on the center (see below). 
All we need is not to put it on a wall or pillar (lab=1)\n            myrposr = 0; myrposc = 0\n            while lab[myrposr, myrposc] == 1 or (myrposr == CTR and myrposc == CTR):\n                myrposr = np.random.randint(1, LABSIZE - 1)\n                myrposc = np.random.randint(1, LABSIZE - 1)\n            rposr[nb] = myrposr; rposc[nb] = myrposc\n            #print(\"Reward pos:\", rposr, rposc)\n            # Agent always starts an episode from the center\n            posc[nb] = CTR\n            posr[nb] = CTR\n\n        #optimizer.zero_grad()\n        loss = 0\n        lossv = 0\n        hidden = net.initialZeroState()\n        hebb = net.initialZeroHebb()\n        et = net.initialZeroHebb() # Eligibility Trace is identical to Hebbian Trace in shape\n        pw = net.initialZeroPlasticWeights()\n        numactionchosen = 0\n\n\n        reward = np.zeros(BATCHSIZE)\n        sumreward = np.zeros(BATCHSIZE)\n        rewards = []\n        vs = []\n        logprobs = []\n        dist = 0\n        numactionschosen = np.zeros(BATCHSIZE, dtype='int32')\n\n        #reloctime = np.random.randint(params['eplen'] // 4, (3 * params['eplen']) // 4)\n\n\n        #print(\"EPISODE \", numiter)\n        for numstep in range(params['eplen']):\n\n            inputs = np.zeros((BATCHSIZE, TOTALNBINPUTS), dtype='float32') \n        \n            labg = lab.copy()\n            #labg[rposr, rposc] = -1  # The agent can see the reward if it falls within its RF\n            for nb in range(BATCHSIZE):\n                inputs[nb, 0:RFSIZE * RFSIZE] = labg[posr[nb] - RFSIZE//2:posr[nb] + RFSIZE//2 +1, posc[nb] - RFSIZE //2:posc[nb] + RFSIZE//2 +1].flatten() * 1.0\n                \n                # Previous chosen action\n                inputs[nb, RFSIZE * RFSIZE +1] = 1.0 # Bias neuron\n                inputs[nb, RFSIZE * RFSIZE +2] = numstep / params['eplen']\n                #inputs[0, RFSIZE * RFSIZE +3] = 1.0 * reward # Reward from previous time step\n                
inputs[nb, RFSIZE * RFSIZE +3] = 1.0 * reward[nb]\n                inputs[nb, RFSIZE * RFSIZE + ADDINPUT + numactionschosen[nb]] = 1\n                #inputs = 100.0 * inputs  # input boosting : Very bad with clamp=0\n            \n            inputsC = torch.from_numpy(inputs).cuda()\n            # Might be better:\n            #if rposr == posr and rposc = posc:\n            #    inputs[0][-4] = 100.0\n            #else:\n            #    inputs[0][-4] = 0\n            \n            # Running the network\n\n            ## Running the network\n            y, v, hidden, hebb, et, pw = net(Variable(inputsC, requires_grad=False), hidden, hebb, et, pw)  # y  should output raw scores, not probas\n\n            # For now:\n            #numactionchosen = np.argmax(y.data[0])\n            # But wait, this is bad, because the network needs to see the\n            # reward signal to guide its own (within-episode) learning... and\n            # argmax might not provide enough exploration for this!\n\n            #ee = np.exp(y.data[0].cpu().numpy())\n            #numactionchosen = np.random.choice(NBNONRESTACTIONS, p = ee / (1e-10 + np.sum(ee)))\n\n            y = F.softmax(y, dim=1)\n            # Must convert y to probas to use this !\n            distrib = torch.distributions.Categorical(y)\n            actionschosen = distrib.sample()  \n            logprobs.append(distrib.log_prob(actionschosen))\n            numactionschosen = actionschosen.data.cpu().numpy()    # Turn to scalar\n            reward = np.zeros(BATCHSIZE, dtype='float32')\n            #if numiter == 7 and numstep == 1:\n            #    pdb.set_trace()\n\n\n            for nb in range(BATCHSIZE):\n                myreward = 0\n                numactionchosen = numactionschosen[nb]\n\n                tgtposc = posc[nb]\n                tgtposr = posr[nb]\n                if numactionchosen == 0:  # Up\n                    tgtposr -= 1\n                elif numactionchosen == 1:  # Down\n                   
 tgtposr += 1\n                elif numactionchosen == 2:  # Left\n                    tgtposc -= 1\n                elif numactionchosen == 3:  # Right\n                    tgtposc += 1\n                else:\n                    raise ValueError(\"Wrong Action\")\n                \n                reward[nb] = 0.0  # The reward for this step\n                if lab[tgtposr][tgtposc] == 1:\n                    reward[nb] -= params['wp']\n                else:\n                    #dist += 1\n                    posc[nb] = tgtposc\n                    posr[nb] = tgtposr\n\n                # Did we hit the reward location ? Increase reward and teleport!\n                # Note that it doesn't matter if we teleport onto the reward, since reward hitting is only evaluated after the (obligatory) move\n                if rposr[nb] == posr[nb] and rposc[nb] == posc[nb]:\n                    reward[nb] += params['rew']\n                    posr[nb]= np.random.randint(1, LABSIZE - 1)\n                    posc[nb] = np.random.randint(1, LABSIZE - 1)\n                    while lab[posr[nb], posc[nb]] == 1 or (rposr[nb] == posr[nb] and rposc[nb] == posc[nb]):\n                        posr[nb] = np.random.randint(1, LABSIZE - 1)\n                        posc[nb] = np.random.randint(1, LABSIZE - 1)\n\n            rewards.append(reward)\n            vs.append(v)\n            sumreward += reward\n\n            loss += ( params['bent'] * y.pow(2).sum() / BATCHSIZE )  # We want to penalize concentration, i.e. encourage diversity; our version of PyTorch does not have an entropy() function for Distribution. Note: .2 may be too strong, .04 may be too weak. \n            #lossentmean  = .99 * lossentmean + .01 * ( params['bent'] * y.pow(2).sum() / BATCHSIZE ).data[0] # We want to penalize concentration, i.e. encourage diversity; our version of PyTorch does not have an entropy() function for Distribution. Note: .2 may be too strong, .04 may be too weak. 
\n\n\n            if PRINTTRACE:\n                #print(\"Step \", numstep, \"- GI: \", goodinputs, \", GA: \", goodaction, \" Inputs: \", inputsN, \" - Outputs: \", y.data.cpu().numpy(), \" - action chosen: \", numactionchosen,\n                #        \" - inputsthisstep:\", inputsthisstep, \" - mean abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())), \" -Rew: \", reward)\n                print(\"Step \", numstep, \" Inputs (to 1st in batch): \", inputs[0, :TOTALNBINPUTS], \" - Outputs(1st in batch): \", y[0].data.cpu().numpy(), \" - action chosen(1st in batch): \", numactionschosen[0],\n                        \" - mean abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())), \" -Reward (this step, 1st in batch): \", reward[0])\n\n\n            # Display the labyrinth\n\n            #for numr in range(LABSIZE):\n            #    s = \"\"\n            #    for numc in range(LABSIZE):\n            #        if posr == numr and posc == numc:\n            #            s += \"o\"\n            #        elif rposr == numr and rposc == numc:\n            #            s += \"X\"\n            #        elif lab[numr, numc] == 1:\n            #            s += \"#\"\n            #        else:\n            #            s += \" \"\n            #    print(s)\n            #print(\"\")\n            #print(\"\")\n\n            labg = lab.copy()\n            labg[rposr[0], rposc[0]] = 2\n            labg[posr[0], posc[0]] = 3\n            fullimg = plt.imshow(labg, animated=True)\n            ax_imgs.append([fullimg])  \n\n\n\n\n        # Episode is done, now let's do the actual computations\n\n        R = Variable(torch.zeros(BATCHSIZE).cuda(), requires_grad=False)\n        gammaR = params['gr']\n        for numstepb in reversed(range(params['eplen'])) :\n            R = gammaR * R + Variable(torch.from_numpy(rewards[numstepb]).cuda(), requires_grad=False)\n            ctrR = R - vs[numstepb][0]\n            lossv += ctrR.pow(2).sum() / BATCHSIZE\n            loss -= 
(logprobs[numstepb] * ctrR.detach()).sum() / BATCHSIZE  # Need to check if detach() is OK\n            #pdb.set_trace()\n\n\n        #elif params['algo'] == 'REI':\n        #    R = sumreward\n        #    baseline = meanrewards[rposr, rposc]\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        loss -= logprobs[numstepb] * (R - baseline)\n        #elif params['algo'] == 'REINOB':\n        #    R = sumreward\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        loss -= logprobs[numstepb] * R\n        #elif params['algo'] == 'REITMP':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        loss -= logprobs[numstepb] * R\n        #elif params['algo'] == 'REITMPB':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        loss -= logprobs[numstepb] * (R - meanrewardstmp[rposr, rposc, numstepb])\n\n        #else:\n        #    raise ValueError(\"Which algo?\")\n\n        #meanrewards[rposr, rposc] = (1.0 - params['nu']) * meanrewards[rposr, rposc] + params['nu'] * sumreward\n        #R = 0\n        #for numstepb in reversed(range(params['eplen'])) :\n        #    R = gammaR * R + rewards[numstepb]\n        #    meanrewardstmp[rposr, rposc, numstepb] = (1.0 - params['nu']) * meanrewardstmp[rposr, rposc, numstepb] + params['nu'] * R\n\n\n        loss += params['blossv'] * lossv\n        loss /= params['eplen']\n\n        if True: #PRINTTRACE:\n            if True: #params['algo'] == 'A3C':\n                print(\"lossv: \", float(lossv))\n            print (\"Total reward for this episode:\", sumreward, \"Dist:\", dist)\n\n        #if numiter > 100:  # Burn-in period for meanrewards\n        #    loss.backward()\n        #    optimizer.step()\n\n        #torch.cuda.empty_cache()\n\n\n    print(\"Saving 
animation....\")\n    anim = animation.ArtistAnimation(fig, ax_imgs, interval=200)\n    anim.save('anim.gif', writer='imagemagick', fps=10)\n\n\n\nif __name__ == \"__main__\":\n#defaultParams = {\n#    'type' : 'lstm',\n#    'seqlen' : 200,\n#    'hiddensize': 500,\n#    'activ': 'tanh',\n#    'steplr': 10e9,  # By default, no change in the learning rate\n#    'gamma': .5,  # The annealing factor of learning rate decay for Adam\n#    'imagesize': 31,    \n#    'nbiter': 30000,  \n#    'lr': 1e-4,   \n#    'test_every': 10,\n#    'save_every': 3000,\n#    'rngseed':0\n#}\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--file\", help=\"params file\")\n    parser.add_argument(\"--initialize\", help=\"should we reinitialize the network (1) or keep the trained network (0)?\", default=0)\n    args = parser.parse_args(); argvars = vars(args); argdict =  { k : argvars[k] for k in argvars if argvars[k] != None }\n    train(argdict)\n\n"
  },
  {
    "path": "maze/batch.py",
"content": "# Backpropamine: differentiable neuromodulated plasticity.\n#\n# Copyright (c) 2018-2019 Uber Technologies, Inc.\n#\n# Licensed under the Uber Non-Commercial License (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at the root directory of this project. \n#\n# See the License file in this repository for the specific language governing \n# permissions and limitations under the License.\n\n\n#This code implements the \"Grid Maze\" task. See Section 4.2 in Miconi et al.\n#ICLR 2019 ( https://openreview.net/pdf?id=r1lrAiA5Ym ), or Section 4.5 in\n#Miconi et al. ICML 2018 ( https://arxiv.org/abs/1804.02464 ).\n\nimport argparse\nimport pdb\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nimport torch.nn.functional as F\nfrom torch import optim\nfrom torch.optim import lr_scheduler\nimport random\nimport sys\nimport pickle\nimport time\nimport os\nimport platform\n##import makemaze\n\n#import matplotlib.pyplot as plt\nimport glob\n\n\n\n\nnp.set_printoptions(precision=4)\nNBDA = 1  # Number of different DA output neurons. At present, the code assumes NBDA=1 and will NOT WORK if you change this.\n\n\nADDINPUT = 4 # 1 input for the previous reward, 1 input for numstep, 1 unused, 1 \"Bias\" input\n\nNBACTIONS = 4  # Up, Down, Left, Right\n\nRFSIZE = 3 # Receptive Field\n\nTOTALNBINPUTS =  RFSIZE * RFSIZE + ADDINPUT + NBACTIONS\n\n##ttype = torch.FloatTensor;\n#ttype = torch.cuda.FloatTensor;\n\n\nclass Network(nn.Module):\n    def __init__(self, params):\n        super(Network, self).__init__()\n        #self.rule = params['rule']\n        self.type = params['type']\n        self.softmax= torch.nn.functional.softmax\n        #if params['activ'] == 'tanh':\n        self.activ = F.tanh\n        if params['type'] == 'rnn':\n            self.i2h = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n        elif params['type'] == 'modplast':\n            self.i2h = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.t(torch.rand(params['hs'], params['hs']))).cuda(), requires_grad=True) \n            self.alpha =  torch.nn.Parameter((.01 * torch.t(torch.rand(params['hs'], params['hs']))).cuda(), requires_grad=True)\n            self.h2DA = torch.nn.Linear(params['hs'], NBDA).cuda()\n        elif params['type'] == 'plastic' :\n            self.i2h = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            self.alpha =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            self.eta = torch.nn.Parameter((.01 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same eta\n        elif params['type'] == 'modul' or params['type'] == 
'modul2':\n            self.i2h = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            self.alpha =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            self.etaet = torch.nn.Parameter((.01 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same etaet\n            self.h2DA = torch.nn.Linear(params['hs'], NBDA).cuda()\n        else:\n            raise ValueError(\"Which network type?\")\n        self.h2o = torch.nn.Linear(params['hs'], NBACTIONS).cuda()\n        self.h2v = torch.nn.Linear(params['hs'], 1).cuda()\n        self.params = params\n\n        # Notice that the vectors are row vectors, and the matrices are transposed wrt the usual order, following apparent pytorch conventions\n        # Each *column* of w targets a single output neuron\n\n    def forward(self, inputs, hidden, hebb, et, pw):\n        BATCHSIZE = self.params['bs']\n        HS = self.params['hs']\n        \n        if self.type == 'rnn':\n            hactiv = self.activ(self.i2h(inputs).view(BATCHSIZE, HS, 1) + torch.matmul(self.w.view(1, HS, HS), \n                hidden.view(BATCHSIZE, HS, 1))).view(BATCHSIZE, HS)\n            hidden = hactiv\n            activout = self.h2o(hactiv)   # Linear! 
To be softmax'ed outside the function\n            valueout = self.h2v(hactiv)\n            #valueout = 0\n\n        \n        elif self.type == 'plastic':\n            # Each row of w and hebb contains the input weights to a single neuron\n            # hidden = x, hactiv = y\n            hactiv = self.activ(self.i2h(inputs).view(BATCHSIZE, HS, 1) + torch.matmul((self.w + torch.mul(self.alpha, hebb)),\n                            hidden.view(BATCHSIZE, HS, 1))).view(BATCHSIZE, HS)\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed later\n            valueout = self.h2v(hactiv)\n            \n            deltahebb =  torch.bmm(hactiv.view(BATCHSIZE, HS, 1), hidden.view(BATCHSIZE, 1, HS)) # batched outer product...should it be other way round?\n            \n            if self.params['addpw'] == 3:\n                # Note that there is no decay, even in the Hebb-rule case : additive only!\n                # Hard clamp\n                hebb = torch.clamp( hebb +  self.eta * deltahebb, min=-1.0, max=1.0)\n            elif self.params['addpw'] == 2:\n                # Note that there is no decay, even in the Hebb-rule case : additive only!\n                # Soft clamp\n                hebb = torch.clamp( hebb +  torch.clamp(self.eta * deltahebb, min=0.0) * (1 - hebb) +  torch.clamp(self.eta * deltahebb, max=0.0) * (hebb + 1) , min=-1.0, max=1.0)\n            elif self.params['addpw'] == 1: # Purely additive, tends to make the meta-learning diverge. No decay/clamp.\n                hebb = hebb + self.eta * deltahebb\n            elif self.params['addpw'] == 0:    \n                # We do it the normal way. Note that here, Hebb-rule is decaying.\n                # There is probably a way to make it more efficient. 
\n                hebb = (1 - self.eta) * hebb + self.eta * deltahebb\n\n            hidden = hactiv\n        \n        elif self.type == 'modplast':\n\n            #Here we compute the same deltahebb for the whole network, and use\n            #the same addpw for the whole network too.  \n\n            # The rows of w and hebb are the inputs weights to a single neuron\n            # hidden = x, hactiv = y\n            hactiv = self.activ(self.i2h(inputs).view(BATCHSIZE, HS, 1) + torch.matmul((self.w + torch.mul(self.alpha, hebb)),\n                            hidden.view(BATCHSIZE, HS, 1))).view(BATCHSIZE, HS)\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed later\n            valueout = self.h2v(hactiv)\n\n            # Now computing the Hebbian updates...\n            \n            # With batching, DAout is a matrix of size BS x 1 (Really BS x NBDA, but we assume NBDA=1 for now in the deltahebb multiplication below)\n            if self.params['da'] == 'tanh':\n                DAout = F.tanh(self.h2DA(hactiv))\n            elif self.params['da'] == 'sig':\n                DAout = F.sigmoid(self.h2DA(hactiv))\n            elif self.params['da'] == 'lin':\n                DAout =  self.h2DA(hactiv)\n            else:\n                raise ValueError(\"Which transformation for DAout ?\")\n            \n            # deltahebb has shape BS x HS x HS\n            # Each row of hebb contain the input weights to a neuron\n            deltahebb =  torch.bmm(hactiv.view(BATCHSIZE, HS, 1), hidden.view(BATCHSIZE, 1, HS)) # batched outer product...should it be other way round?\n\n\n            if self.params['addpw'] == 3: # Hard clamp, purely additive\n                # Note that we do the same for Hebb and Oja's rule\n                hebb1 = torch.clamp(hebb + DAout.view(BATCHSIZE, 1, 1) * deltahebb, min=-1.0, max=1.0)\n            elif self.params['addpw'] == 2:\n                # Note that there is no decay, even in the Hebb-rule 
case : additive only!\n                hebb1 = torch.clamp( hebb +  torch.clamp(DAout.view(BATCHSIZE, 1, 1) * deltahebb, min=0.0) * (1 - hebb) +  \n                        torch.clamp(DAout.view(BATCHSIZE, 1, 1)  * deltahebb, max=0.0) * (hebb + 1) , min=-1.0, max=1.0)\n            elif self.params['addpw'] == 1: # Purely additive. This will almost certainly diverge, don't use it! \n                hebb1 = hebb + DAout.view(BATCHSIZE, 1, 1) * deltahebb\n\n            elif self.params['addpw'] == 0:    \n                # We do it the old way. Note that here, Hebb-rule is decaying.\n                # There is probably a way to make it more efficient \n                # NOTE: THIS WILL GO AWRY if DAout is allowed to go outside [0,1]!\n                # Note 2: For Oja's rule, there is no difference between addpw 0 and addpw 1\n                hebb1 = (1 - DAout.view(BATCHSIZE,1,1)) * hebb + DAout.view(BATCHSIZE, 1, 1) * deltahebb\n            else:\n                raise ValueError(\"Which additive form for plastic weights?\")\n            \n            hebb = hebb1\n            hidden = hactiv\n\n        \n        elif self.type == 'modul':\n            \n            # The rows of w and hebb are the inputs weights to a single neuron\n            # hidden = x, hactiv = y\n            hactiv = self.activ(self.i2h(inputs).view(BATCHSIZE, HS, 1) + torch.matmul((self.w + torch.mul(self.alpha, pw)),\n                            hidden.view(BATCHSIZE, HS, 1))).view(BATCHSIZE, HS)\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed later\n            valueout = self.h2v(hactiv)\n\n            # Now computing the Hebbian updates...\n            \n            # With batching, DAout is a matrix of size BS x 1 (Really BS x NBDA, but we assume NBDA=1 for now in the deltahebb multiplication below)\n            if self.params['da'] == 'tanh':\n                DAout = F.tanh(self.h2DA(hactiv))\n            elif self.params['da'] == 'sig':\n                DAout = F.sigmoid(self.h2DA(hactiv))\n            elif self.params['da'] == 'lin':\n                DAout =  self.h2DA(hactiv)\n            else:\n                raise ValueError(\"Which transformation for DAout ?\")\n\n\n            # We need to select the order of operations; network update, e.t. update, neuromodulated incorporation into plastic weights\n            # One possibility (for now go with this one):\n            #    - computing all outputs from current inputs, including DA\n            #    - incorporating neuromodulated Hebb/eligibility trace into plastic weights\n            #    - computing updated hebb/eligibility traces \n            # Another possibility (modul2):\n            #    - computing all outputs from current inputs, including DA\n            #    - computing updated Hebb/eligibility traces\n            #    - incorporating this modified Hebb into plastic weights through neuromodulation\n\n\n            # In modul2 we would compute deltaet and update et here too; here we compute them later\n            \n            if self.params['addpw'] == 3:\n                # Hard clamp\n                # From modplast/addpw=3: hebb1 = torch.clamp(hebb + DAout.view(BATCHSIZE, 1, 1) * deltahebb, min=-1.0, max=1.0)\n                deltapw = DAout.view(BATCHSIZE,1,1) * et\n                pw1 = torch.clamp(pw + deltapw, min=-1.0, max=1.0)\n            elif self.params['addpw'] == 2:\n                deltapw = DAout.view(BATCHSIZE,1,1) * et\n                # This constrains the pw to stay within [-1, 1] (we could also do that by putting a tanh on top of it, but instead we want pw itself to remain within that range, to avoid large gradients and facilitate movement back to 0)\n                # The outer clamp is there for safety. 
In theory the expression within that clamp is \"softly\" constrained to stay within [-1, 1], but finite-size effects might throw it off.\n                pw1 = torch.clamp( pw +  torch.clamp(deltapw, min=0.0) * (1 - pw) +  torch.clamp(deltapw, max=0.0) * (pw + 1) , min=-.99999, max=.99999)\n            elif self.params['addpw'] == 1: # Purely additive, tends to make the meta-learning diverge\n                deltapw = DAout.view(BATCHSIZE,1,1) * et\n                pw1 = pw + deltapw\n            elif self.params['addpw'] == 0:    \n                # We do it the old way, with a decay term. \n                # This will FAIL if DAout is allowed to go outside [0,1]\n                # Note: this makes the plastic weights decaying!\n                pw1 = (1 - DAout.view(BATCHSIZE,1,1)) * pw + DAout.view(BATCHSIZE, 1, 1) * et\n            else:\n                raise ValueError(\"Which additive form for plastic weights?\")\n            \n            pw = pw1\n\n            # Updating the eligibility trace - always a simple decay term. \n            deltaet =  torch.bmm(hactiv.view(BATCHSIZE, HS, 1), hidden.view(BATCHSIZE, 1, HS)) # batched outer product...should it be other way round?\n            et = (1 - self.etaet) * et + self.etaet *  deltaet\n            \n            hidden = hactiv\n\n        else:\n            raise ValueError(\"Must select network type\")\n\n\n\n        return activout, valueout, hidden, hebb, et, pw\n\n\n\n    def initialZeroHebb(self):\n        return Variable(torch.zeros(self.params['bs'], self.params['hs'], self.params['hs']) , requires_grad=False).cuda()\n    def initialZeroPlasticWeights(self):\n        return Variable(torch.zeros(self.params['bs'], self.params['hs'], self.params['hs']) , requires_grad=False).cuda()\n\n    def initialZeroState(self):\n        BATCHSIZE = self.params['bs']\n        return Variable(torch.zeros(BATCHSIZE, self.params['hs']), requires_grad=False ).cuda()\n\n\n\ndef train(paramdict):\n    #params = dict(click.get_current_context().params)\n\n    #TOTALNBINPUTS =  RFSIZE * RFSIZE + ADDINPUT + 
NBNONRESTACTIONS\n    print(\"Starting training...\")\n    params = {}\n    #params.update(defaultParams)\n    params.update(paramdict)\n    print(\"Passed params: \", params)\n    print(platform.uname())\n    #params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['interpresdelay']) * params['nbclasses']) + params['prestimetest']  # Total number of steps per episode\n    suffix = \"btchFixmod_\"+\"\".join([str(x)+\"_\" if pair[0] != 'nbsteps' and pair[0] != 'rngseed' and pair[0] != 'save_every' and pair[0] != 'test_every' and pair[0] != 'pe' else '' for pair in sorted(zip(params.keys(), params.values()), key=lambda x:x[0] ) for x in pair])[:-1] + \"_rngseed_\" + str(params['rngseed'])   # Turning the parameters into a nice suffix for filenames\n\n    # Initialize random seeds (first two redundant?)\n    print(\"Setting random seeds\")\n    np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n    #print(click.get_current_context().params)\n\n    print(\"Initializing network\")\n    net = Network(params)\n    print (\"Shape of all optimized parameters:\", [x.size() for x in net.parameters()])\n    allsizes = [torch.numel(x.data.cpu()) for x in net.parameters()]\n    print (\"Size (numel) of all optimized elements:\", allsizes)\n    print (\"Total size (numel) of all optimized elements:\", sum(allsizes))\n\n    #total_loss = 0.0\n    print(\"Initializing optimizer\")\n    optimizer = torch.optim.Adam(net.parameters(), lr=1.0*params['lr'], eps=1e-4, weight_decay=params['l2'])\n\n    BATCHSIZE = params['bs']\n\n    LABSIZE = params['msize'] \n    lab = np.ones((LABSIZE, LABSIZE))\n    CTR = LABSIZE // 2 \n\n\n    # Grid maze\n    lab[1:LABSIZE-1, 1:LABSIZE-1].fill(0)\n    for row in range(1, LABSIZE - 1):\n        for col in range(1, LABSIZE - 1):\n            if row % 2 == 0 and col % 2 == 0:\n                lab[row, col] = 1\n    # Not strictly necessary, but cleaner since we start the agent at the\n    # center for each episode; may help localization in some maze sizes\n    # (including 13 and 9, but not 11) by introducing a detectable irregularity\n    # in the center:\n    lab[CTR,CTR] = 0 \n\n\n\n    all_losses = []\n    all_grad_norms = []\n    all_losses_objective = []\n    all_total_rewards = []\n    all_losses_v = []\n    lossbetweensaves = 0\n    nowtime = time.time()\n    meanrewards = np.zeros((LABSIZE, LABSIZE))\n    meanrewardstmp = np.zeros((LABSIZE, LABSIZE, params['eplen']))\n\n\n    pos = 0\n    hidden = net.initialZeroState()\n    hebb = net.initialZeroHebb()\n    pw = net.initialZeroPlasticWeights()\n\n    print(\"Total number of parameters:\", sum([x.numel() for x in net.parameters()]))\n\n\n    print(\"Starting episodes!\")\n\n    for numiter in range(params['nbiter']):\n\n        PRINTTRACE = 0\n        if (numiter+1) % (params['pe']) == 0:\n            PRINTTRACE = 1\n\n\n        # Select the reward location for this episode - not on a wall!\n        # And not on the center either! (though not sure how useful that restriction is...)\n        # We always start the episode from the center (when hitting reward, we may teleport either to center or to a random location depending on params['rsp'])\n        posr = {}; posc = {}\n        rposr = {}; rposc = {}\n        for nb in range(BATCHSIZE):\n            # Note: it doesn't matter if the reward is on the center (see below). 
All we need is not to put it on a wall or pillar (lab=1)\n            myrposr = 0; myrposc = 0\n            while lab[myrposr, myrposc] == 1 or (myrposr == CTR and myrposc == CTR):\n                myrposr = np.random.randint(1, LABSIZE - 1)\n                myrposc = np.random.randint(1, LABSIZE - 1)\n            rposr[nb] = myrposr; rposc[nb] = myrposc\n            #print(\"Reward pos:\", rposr, rposc)\n            # Agent always starts an episode from the center\n            posc[nb] = CTR\n            posr[nb] = CTR\n\n        optimizer.zero_grad()\n        loss = 0\n        lossv = 0\n        hidden = net.initialZeroState()\n        hebb = net.initialZeroHebb()\n        et = net.initialZeroHebb() # Eligibility Trace is identical to Hebbian Trace in shape\n        pw = net.initialZeroPlasticWeights()\n        numactionchosen = 0\n\n\n        reward = np.zeros(BATCHSIZE)\n        sumreward = np.zeros(BATCHSIZE)\n        rewards = []\n        vs = []\n        logprobs = []\n        dist = 0\n        numactionschosen = np.zeros(BATCHSIZE, dtype='int32')\n\n        for numstep in range(params['eplen']):\n\n\n\n\n            inputs = np.zeros((BATCHSIZE, TOTALNBINPUTS), dtype='float32') \n        \n            labg = lab.copy()\n            for nb in range(BATCHSIZE):\n                inputs[nb, 0:RFSIZE * RFSIZE] = labg[posr[nb] - RFSIZE//2:posr[nb] + RFSIZE//2 +1, posc[nb] - RFSIZE //2:posc[nb] + RFSIZE//2 +1].flatten() * 1.0\n                \n                # Previous chosen action\n                inputs[nb, RFSIZE * RFSIZE +1] = 1.0 # Bias neuron\n                inputs[nb, RFSIZE * RFSIZE +2] = numstep / params['eplen']\n                inputs[nb, RFSIZE * RFSIZE +3] = 1.0 * reward[nb]\n                inputs[nb, RFSIZE * RFSIZE + ADDINPUT + numactionschosen[nb]] = 1\n            \n            inputsC = torch.from_numpy(inputs).cuda()\n            \n            ##### Running the network\n            y, v, hidden, hebb, et, pw = net(Variable(inputsC, 
requires_grad=False), hidden, hebb, et, pw)  # y  should output raw scores, not probas\n\n\n            y = F.softmax(y, dim=1)     # Now y is converted to \"proba-like\" quantities\n            distrib = torch.distributions.Categorical(y)\n            actionschosen = distrib.sample()  \n            logprobs.append(distrib.log_prob(actionschosen))\n            numactionschosen = actionschosen.data.cpu().numpy()    # Convert to a numpy array of action indices\n            reward = np.zeros(BATCHSIZE, dtype='float32')\n\n\n            for nb in range(BATCHSIZE):\n                myreward = 0\n                numactionchosen = numactionschosen[nb]\n\n                tgtposc = posc[nb]\n                tgtposr = posr[nb]\n                if numactionchosen == 0:  # Up\n                    tgtposr -= 1\n                elif numactionchosen == 1:  # Down\n                    tgtposr += 1\n                elif numactionchosen == 2:  # Left\n                    tgtposc -= 1\n                elif numactionchosen == 3:  # Right\n                    tgtposc += 1\n                else:\n                    raise ValueError(\"Wrong Action\")\n                \n                reward[nb] = 0.0  # The reward for this step\n                if lab[tgtposr][tgtposc] == 1:\n                    reward[nb] -= params['wp']\n                else:\n                    #dist += 1\n                    posc[nb] = tgtposc\n                    posr[nb] = tgtposr\n\n                # Did we hit the reward location ? 
Increase reward and teleport!\n                # Note that it doesn't matter if we teleport onto the reward, since reward hitting is only evaluated after the (obligatory) move\n                if rposr[nb] == posr[nb] and rposc[nb] == posc[nb]:\n                    reward[nb] += params['rew']\n                    posr[nb]= np.random.randint(1, LABSIZE - 1)\n                    posc[nb] = np.random.randint(1, LABSIZE - 1)\n                    while lab[posr[nb], posc[nb]] == 1 or (rposr[nb] == posr[nb] and rposc[nb] == posc[nb]):\n                        posr[nb] = np.random.randint(1, LABSIZE - 1)\n                        posc[nb] = np.random.randint(1, LABSIZE - 1)\n\n            rewards.append(reward)\n            vs.append(v)\n            sumreward += reward\n\n\n            # This is the \"entropy bonus\" of A2C, except that since our version\n            # of PyTorch doesn't have an entropy() function, we implement it as\n            # a penalty on the sum of squares instead. The effect is the same:\n            # we want to penalize concentration of probabilities, i.e.\n            # encourage diversity of actions.\n            loss += ( params['bent'] * y.pow(2).sum() / BATCHSIZE )  \n\n\n            if PRINTTRACE:\n                print(\"Step \", numstep, \" Inputs (to 1st in batch): \", inputs[0, :TOTALNBINPUTS], \" - Outputs(1st in batch): \", y[0].data.cpu().numpy(), \" - action chosen(1st in batch): \", numactionschosen[0],\n                        \" - mean abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())), \" -Reward (this step, 1st in batch): \", reward[0])\n\n\n\n        # Episode is done, now let's do the actual computations\n\n\n        R = Variable(torch.zeros(BATCHSIZE).cuda(), requires_grad=False)\n        gammaR = params['gr']\n        for numstepb in reversed(range(params['eplen'])) :\n            R = gammaR * R + Variable(torch.from_numpy(rewards[numstepb]).cuda(), requires_grad=False)\n            ctrR = R - vs[numstepb][0]\n            
lossv += ctrR.pow(2).sum() / BATCHSIZE\n            loss -= (logprobs[numstepb] * ctrR.detach()).sum() / BATCHSIZE  # Need to check if detach() is OK\n            #pdb.set_trace()\n\n\n\n        # These are different algorithms (essentially variants of REINFORCE) that do not train a value predictor inside the network... Might be interesting to see if value prediction emerges even if it's not explicitly demanded by the meta-training algorithm!\n\n        #elif params['algo'] == 'REI':\n        #    R = sumreward\n        #    baseline = meanrewards[rposr, rposc]\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        loss -= logprobs[numstepb] * (R - baseline)\n        #elif params['algo'] == 'REINOB':\n        #    R = sumreward\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        loss -= logprobs[numstepb] * R\n        #elif params['algo'] == 'REITMP':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        loss -= logprobs[numstepb] * R\n        #elif params['algo'] == 'REITMPB':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        loss -= logprobs[numstepb] * (R - meanrewardstmp[rposr, rposc, numstepb])\n        #else:\n        #    raise ValueError(\"Which algo?\")\n\n        #meanrewards[rposr, rposc] = (1.0 - params['nu']) * meanrewards[rposr, rposc] + params['nu'] * sumreward\n        #R = 0\n        #for numstepb in reversed(range(params['eplen'])) :\n        #    R = gammaR * R + rewards[numstepb]\n        #    meanrewardstmp[rposr, rposc, numstepb] = (1.0 - params['nu']) * meanrewardstmp[rposr, rposc, numstepb] + params['nu'] * R\n\n        loss += params['blossv'] * lossv\n        loss /= params['eplen']\n\n        if PRINTTRACE:\n            if True: #params['algo'] == 'A3C':\n                
#print(\"lossv: \", lossv.data.cpu().numpy()[0])\n                print(\"lossv: \", float(lossv))\n            print (\"Total reward for this episode (all in batch):\", sumreward, \"Dist:\", dist)\n\n        #if params['squash'] == 1:\n        #    if sumreward < 0:\n        #        sumreward = -np.sqrt(-sumreward)\n        #    else:\n        #        sumreward = np.sqrt(sumreward)\n        #elif params['squash'] == 0:\n        #    pass\n        #else:\n        #    raise ValueError(\"Incorrect value for squash parameter\")\n\n        #loss *= sumreward\n\n        #for p in net.parameters():\n        #    p.grad.data.clamp_(-params['clp'], params['clp'])\n        loss.backward()\n        all_grad_norms.append(torch.nn.utils.clip_grad_norm(net.parameters(), params['gc']))\n        if numiter > 100:  # Burn-in period for meanrewards\n            optimizer.step()\n            #pdb.set_trace()\n\n\n        lossnum = float(loss)\n        lossbetweensaves += lossnum\n        all_losses_objective.append(lossnum)\n        all_total_rewards.append(sumreward.mean())\n\n\n        if (numiter+1) % params['pe'] == 0:\n\n            print(numiter, \"====\")\n            print(\"Mean loss: \", lossbetweensaves / params['pe'])\n            lossbetweensaves = 0\n            print(\"Mean reward (across batch and last\", params['pe'], \"eps.): \", np.sum(all_total_rewards[-params['pe']:])/ params['pe'])\n            previoustime = nowtime\n            nowtime = time.time()\n            print(\"Time spent on last\", params['pe'], \"iters: \", nowtime - previoustime)\n            if params['type'] == 'plastic' or params['type'] == 'lstmplastic':\n                print(\"ETA: \", net.eta.data.cpu().numpy(), \"alpha[0,1]: \", net.alpha.data.cpu().numpy()[0,1], \"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n            elif params['type'] == 'modul':\n                print(\"etaet: \", float(net.etaet), \" mean-abs pw: \", torch.mean(torch.abs(pw.data)))\n            elif 
params['type'] == 'rnn':\n                print(\"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n\n        if (numiter+1) % params['save_every'] == 0:\n            print(\"Saving files...\")\n            losslast100 = np.mean(all_losses_objective[-100:])\n            print(\"Average loss over the last 100 episodes:\", losslast100)\n            print(\"Saving local files...\")\n            with open('grad_'+suffix+'.txt', 'w') as thefile:\n                for item in all_grad_norms[::10]:\n                        thefile.write(\"%s\\n\" % item)\n            with open('loss_'+suffix+'.txt', 'w') as thefile:\n                for item in all_total_rewards[::10]:\n                        thefile.write(\"%s\\n\" % item)\n            torch.save(net.state_dict(), 'torchmodel_'+suffix+'.dat')\n            with open('params_'+suffix+'.dat', 'wb') as fo:\n                pickle.dump(params, fo)\n            print(\"Done!\")\n            # Uber-only stuff:\n            if os.path.isdir('/mnt/share/tmiconi'):\n                print(\"Transferring to NFS storage...\")\n                for fn in ['params_'+suffix+'.dat', 'loss_'+suffix+'.txt', 'torchmodel_'+suffix+'.dat']:\n                    result = os.system(\n                        'cp {} {}'.format(fn, '/mnt/share/tmiconi/modulmaze/'+fn))\n                print(\"Done!\")\n\n\n\nif __name__ == \"__main__\":\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--rngseed\", type=int, help=\"random seed\", default=0)\n    #parser.add_argument(\"--clamp\", type=float, help=\"maximum (absolute value) gradient for clamping\", default=1000000.0)\n    #parser.add_argument(\"--wp\", type=float, help=\"wall penalty (reward decrement for hitting a wall)\", default=0.1)\n    parser.add_argument(\"--rew\", type=float, help=\"reward value (reward increment for taking correct action after correct stimulus)\", default=1.0)\n    parser.add_argument(\"--wp\", type=float, help=\"penalty for hitting walls\", default=.05)\n    
#parser.add_argument(\"--pen\", type=float, help=\"penalty value (reward decrement for taking any non-rest action)\", default=.2)\n    #parser.add_argument(\"--exprew\", type=float, help=\"reward value (reward increment for hitting reward location)\", default=.0)\n    parser.add_argument(\"--bent\", type=float, help=\"coefficient for the entropy reward (really Simpson index concentration measure)\", default=0.03)\n    #parser.add_argument(\"--probarev\", type=float, help=\"probability of reversal (random change) in desired stimulus-response, per time step\", default=0.0)\n    parser.add_argument(\"--blossv\", type=float, help=\"coefficient for value prediction loss\", default=.1)\n    #parser.add_argument(\"--lsize\", type=int, help=\"size of the labyrinth; must be odd\", default=7)\n    #parser.add_argument(\"--rp\", type=int, help=\"whether the reward should be on the periphery\", default=0)\n    #parser.add_argument(\"--squash\", type=int, help=\"squash reward through signed sqrt (1 or 0)\", default=0)\n    #parser.add_argument(\"--nbarms\", type=int, help=\"number of arms\", default=2)\n    #parser.add_argument(\"--nbseq\", type=int, help=\"number of sequences between reinitializations of hidden/Hebbian state and position\", default=3)\n    #parser.add_argument(\"--activ\", help=\"activ function ('tanh' or 'selu')\", default='tanh')\n    #parser.add_argument(\"--algo\", help=\"meta-learning algorithm (A3C or REI)\", default='A3C')\n    #parser.add_argument(\"--rule\", help=\"learning rule ('hebb' or 'oja')\", default='hebb')\n    parser.add_argument(\"--type\", help=\"network type ('lstm' or 'rnn' or 'plastic')\", default='modul')\n    parser.add_argument(\"--msize\", type=int, help=\"size of the maze; must be odd\", default=9)\n    parser.add_argument(\"--da\", help=\"transformation function of DA signal (tanh or sig or lin)\", default='tanh')\n    parser.add_argument(\"--gr\", type=float, help=\"gammaR: discounting factor for rewards\", default=.9)\n    
parser.add_argument(\"--gc\", type=float, help=\"gradient norm clipping\", default=1000.0)\n    parser.add_argument(\"--lr\", type=float, help=\"learning rate (Adam optimizer)\", default=1e-4)\n    #parser.add_argument(\"--nu\", type=float, help=\"REINFORCE baseline time constant\", default=.1)\n    #parser.add_argument(\"--samestep\", type=int, help=\"compare stimulus and response in the same step (1) or from successive steps (0) ?\", default=0)\n    #parser.add_argument(\"--nbin\", type=int, help=\"number of possible input stimuli\", default=4)\n    #parser.add_argument(\"--modhalf\", type=int, help=\"which half of the recurrent network receives modulation (1 or 2)\", default=1)\n    #parser.add_argument(\"--nbac\", type=int, help=\"number of possible non-rest actions\", default=4)\n    parser.add_argument(\"--rsp\", type=int, help=\"does the agent start each episode from random position (1) or center (0) ?\", default=1)\n    parser.add_argument(\"--addpw\", type=int, help=\"are plastic weights purely additive (1) or forgetting (0) ?\", default=1)\n    #parser.add_argument(\"--clp\", type=int, help=\"inputs clamped (1), fully clamped (2) or through linear layer (0) ?\", default=0)\n    #parser.add_argument(\"--md\", type=int, help=\"maximum delay for reward reception\", default=0)\n    parser.add_argument(\"--eplen\", type=int, help=\"length of episodes\", default=100)\n    #parser.add_argument(\"--exptime\", type=int, help=\"exploration (no reward) time (must be < eplen)\", default=0)\n    parser.add_argument(\"--hs\", type=int, help=\"size of the recurrent (hidden) layer\", default=100)\n    parser.add_argument(\"--bs\", type=int, help=\"batch size\", default=1)\n    parser.add_argument(\"--l2\", type=float, help=\"coefficient of L2 norm (weight decay)\", default=3e-6)\n    #parser.add_argument(\"--steplr\", type=int, help=\"duration of each step in the learning rate annealing schedule\", default=100000000)\n    #parser.add_argument(\"--gamma\", type=float, 
help=\"learning rate annealing factor\", default=0.3)\n    parser.add_argument(\"--nbiter\", type=int, help=\"number of learning cycles\", default=1000000)\n    parser.add_argument(\"--save_every\", type=int, help=\"number of cycles between successive save points\", default=1000)\n    parser.add_argument(\"--pe\", type=int, help=\"number of cycles between successive printing of information\", default=100)\n    #parser.add_argument(\"--\", type=int, help=\"\", default=1e-4)\n    args = parser.parse_args(); argvars = vars(args); argdict = { k : argvars[k] for k in argvars if argvars[k] is not None }\n    #train()\n    train(argdict)\n\n"
  },
  {
    "path": "maze/makefigure.py",
    "content": "import numpy as np\nimport glob\nimport matplotlib.pyplot as plt\nimport scipy\nfrom scipy import stats\n\n#colorz = ['r', 'b', 'g', 'c', 'm', 'y', 'orange', 'k']\ncolorz = ['r', 'm', 'b', 'c', 'y', 'orange']\n#colorz = ['g', 'g', 'r', 'r', 'b', 'b']\n\nplt.rc('font', size=14)\n\n\ngroupnames = glob.glob('./tmp/loss*tch*gc_4*msize_13*seed_0.txt')  \n#groupnames = glob.glob('./tmp/loss*tch*gc_*msize_17*seed_0.txt')  \n#groupnames = glob.glob('./tmp_prev/loss*addpw_3*md_0*msize_13*rew_1.*seed_0.txt')  \n#groupnames = [x for x in groupnames if not (('pw_0' in x) or ('maz' in x) or  \n#    ('modul2' in x) or ('rsp_0' in x))]  # pw_0 is bad, rsp_0 is a different setting, modul2 has similar results to modul, maz is very slightly different\n\n#groupnames = glob.glob('./tmp/loss*msize_11*seed_0.txt')  \n#groupnames = glob.glob('./tmp/loss*l2_3e-06*md_4*msize_13*seed_0.txt')  # 11 / hs 100, modul vs modplast, with or without delay (and one w/ 4 cpus i.o. 2)\n#groupnames = glob.glob('./tmp/loss*l2_3e-06*md_0*msize_13*seed_0.txt')  # 11 / hs 100, modul vs modplast, with or without delay (and one w/ 4 cpus i.o. 2)\n#groupnames = glob.glob('./tmp/loss*addpw_2*l2_3e-06*md_0*msize_13*seed_0.txt')  # 11 / hs 100, modul vs modplast, with or without delay (and one w/ 4 cpus i.o. 
2)\n\n\n\n#groupnames = glob.glob('./tmp/loss*msize_15*seed_0.txt')  # 15, hs 200, modul vs modplast\n#groupnames = glob.glob('./tmp/loss*hs_100*msize_13*seed_0.txt')  # 13, hs 100, modul vs modplast\n\n#groupnames = glob.glob('./loss*seed_0.txt')  \n#groupnames = glob.glob('./tmp/loss*msize_13*plastic*seed_0.txt')  \n#groupnames = glob.glob('./tmp/loss*msize_9*seed_0.txt')  \n#groupnames = glob.glob('./tmp/loss*modplast*seed_0.txt')  \n\n\n\n#groupnames = glob.glob('./tmp/loss_*new*eplen_250*rngseed_0.txt')  \n#groupnames = glob.glob('./loss_*rngseed_0.txt')  \n\n\n\n# If you can only use 7 runs, smooth the losses within each run to obtain more reliable estimates of performance!\n\n\ndef mavg(x, N):\n  cumsum = np.cumsum(np.insert(x, 0, 0)) \n  return (cumsum[N:] - cumsum[:-N]) / N\n\nplt.ion()\n#plt.figure(figsize=(5,4))  # Smaller figure = relatively larger fonts\n#plt.figure(figsize=(9,7))  # Smaller figure = relatively larger fonts\nplt.figure(figsize=(7,5))  # Smaller figure = relatively larger fonts\n#plt.figure()\n\n\nallmedianls = []\nalllosses = []\nposcol = 0\nmaxminlen = 0\nminminlen = 999999\n\n# Generate labels, and order of curves\nnamez = []\nfor numx, x in enumerate(groupnames):\n    if 'rnn' in x:\n        if '139' in x:\n            myname = 'Non-plastic (139 neurons)'\n        else:\n            myname = 'Non-plastic (100 neurons)'\n    elif 'modul' in x:\n        myname = \"Retroactive modulation (100 neurons)\"\n    elif 'modplast' in x:\n        myname = \"Simple modulation (100 neurons)\"\n    elif 'plastic' in x:\n        myname = \"Non-modulated plasticity (101 neurons)\"\n    #if 'pw_3' in x:\n    #    myname += \" (Hard Clip)\"\n    #else:\n    #    myname += \" (Soft Clip)\"\n    namez.append(myname)\norder = np.argsort(namez)[::-1]\nnamez = [namez[c] for c in order]\ngroupnames = [groupnames[c] for c in order]\n\nfor numgroup, groupname in enumerate(groupnames):\n    if \"batch\" in groupname:\n        continue\n    #if \"lstm\" not in 
groupname:\n    #    continue\n    g = groupname[:-6]+\"*\"\n    print(\"====\", groupname)\n    fnames = glob.glob(g)\n    fulllosses=[]\n    losses=[]\n    lgts=[]\n    for fn in fnames:\n        if \"COPY\" in fn:\n            continue\n        if False:\n            #if \"seed_3\" in fn:\n            #    continue\n            #if \"seed_7\" in fn:\n            #    continue\n            if \"seed_3\" in fn:\n                continue\n            #if \"seed_9\" in fn:\n            #    continue\n            #if \"seed_10\" in fn:\n            #    continue\n            #if \"seed_11\" in fn:\n            #    continue\n            #if \"seed_12\" in fn:\n            #    continue\n            #if \"seed_13\" in fn:\n            #    continue\n            #if \"seed_14\" in fn:\n            #    continue\n            #if \"seed_15\" in fn:\n            #    continue\n        z = np.loadtxt(fn)\n        \n        #z = mavg(z, 10)  # For each run, we average the losses over K successive episodes\n\n        z = z[::10] # Decimation - speed things up!\n\n        z = z[:1000] #  Only plot the first 100K episodes (taking into account decimation above and only every 10th episode is stored in the first place)\n\n        print(fn, len(z))\n        if len(z) < 10:\n            print(fn, len(z))\n            continue\n        #z = z[:90]\n        lgts.append(len(z))\n        fulllosses.append(z)\n    minlen = min(lgts)\n    if minlen > maxminlen:\n        maxminlen = minlen\n    if minlen < minminlen:\n        minminlen = minlen\n    print(\"Minlen:\", minlen)\n    #if minlen < 1000:\n    #    continue\n    for z in fulllosses:\n        losses.append(z[:minlen])\n\n    losses = np.array(losses)\n    alllosses.append(losses)\n    \n    meanl = np.mean(losses, axis=0)\n    stdl = np.std(losses, axis=0)\n    #cil = stdl / np.sqrt(losses.shape[0]) * 1.96  # 95% confidence interval - assuming normality\n    #cil = stdl / np.sqrt(losses.shape[0]) * 2.5  # 95% confidence interval 
- approximated with the t-distribution for 7 d.f.\n\n    medianl = np.median(losses, axis=0)\n    allmedianls.append(medianl)\n    q1l = np.percentile(losses, 25, axis=0) # 1st quartile\n    q3l = np.percentile(losses, 75, axis=0) # 3rd quartile\n    \n    highl = np.max(losses, axis=0)\n    lowl = np.min(losses, axis=0)\n    #highl = meanl+stdl\n    #lowl = meanl-stdl\n\n    xx = range(len(meanl))\n\n    # xticks and labels\n    #xt = range(0, maxminlen, 1000)\n    xt = range(0, 1001, 200)\n    #xt = range(0, len(meanl), 100)\n    #xt = range(0, len(meanl), 1000)\n    #xt = range(0, 10001, 2000)\n    xtl = [str(10 * 10 * i) for i in xt]   # Because of decimation above, and only every 10th loss is recorded in the files\n\n    #plt.plot(mavg(meanl, 100), label=g) #, color='blue')\n    #plt.fill_between(xx, lowl, highl,  alpha=.2)\n    #plt.fill_between(xx, q1l, q3l,  alpha=.1)\n    #plt.plot(meanl) #, color='blue')\n    ####plt.plot(mavg(medianl, 100), label=g) #, color='blue')  # mavg changes the number of points !\n    #plt.plot(mavg(q1l, 100), label=g, alpha=.3) #, color='blue')\n    #plt.plot(mavg(q3l, 100), label=g, alpha=.3) #, color='blue')\n    #plt.fill_between(xx, q1l, q3l,  alpha=.2)\n    #plt.plot(medianl, label=g) #, color='blue')\n   \n    AVGSIZE = 20 # size of the moving average window\n    \n    xlen = len(mavg(q1l, AVGSIZE))\n    #mylabel = g[g.find('type'):]\n    mylabel = namez[numgroup]# g\n    print(numgroup, mylabel)\n    #if numgroup // 8 == 0:\n    #    zestyle = '-'\n    #elif numgroup // 8 == 1:\n    #    zestyle = '--'\n    #elif numgroup // 8 == 2:\n    #    zestyle = ':'\n    if numgroup % 2 == 0:\n        zestyle = '-'\n    else:\n        zestyle = '--'\n    \n    plt.plot(mavg(medianl, AVGSIZE), label=mylabel, color=colorz[poscol % len(colorz)], ls=zestyle, lw=2)  # mavg changes the number of points !\n    plt.fill_between( range(xlen), mavg(q1l, AVGSIZE), mavg(q3l, AVGSIZE),  alpha=.2, color=colorz[poscol % len(colorz)])\n    \n    
#xlen = len(mavg(meanl, AVGSIZE))\n    #plt.plot(mavg(meanl, AVGSIZE), label=g, color=colorz[poscol % len(colorz)])  # mavg changes the number of points !\n    #plt.fill_between( range(xlen), mavg(meanl - cil, AVGSIZE), mavg(meanl + cil, AVGSIZE),  alpha=.2, color=colorz[poscol % len(colorz)])\n    \n    poscol += 1\n    \n    #plt.fill_between( range(xlen), mavg(lowl, 100), mavg(highl, 100),  alpha=.2, color=colorz[numgroup % len(colorz)])\n\n    #plt.plot(mavg(losses[0], 1000), label=g, color=colorz[numgroup % len(colorz)])\n    #for curve in losses[1:]:\n    #    plt.plot(mavg(curve, 1000), color=colorz[numgroup % len(colorz)])\n\nps = []\n# Adapt for varying lengths across groups\n#for n in range(0, alllosses[0].shape[1], 3):\n\n#for n in range(0, minminlen):\n#    ps.append(scipy.stats.ranksums(alllosses[0][:,n], alllosses[1][:,n]).pvalue)\n#ps = np.array(ps)\n\na = alllosses\nsignifs = []\nfor n in range(minminlen):\n    signifs.append((scipy.stats.ranksums(a[0][:,n], a[4][:,n])).pvalue)\nsignifs = [x[0] for x in zip(range(minlen), signifs) if x[1] < .05]\n\nplt.plot( np.array(signifs), [20]*len(signifs), '*')\n\n####plt.legend(loc=(.430,.15), fontsize=13)\nplt.legend(loc='upper left', fontsize=13)\n#plt.legend(loc='best', fontsize=13)\n#plt.xlabel('Loss (sum square diff. b/w final output and target)')\nplt.xlabel('Number of Episodes')\nplt.ylabel('Reward')\nplt.xticks(xt, xtl)\n#plt.tight_layout()\n\n\n\n"
  },
  {
    "path": "maze/makemaze.py",
    "content": "# Not used for the current version.\n\nimport numpy as np\n\ndef genmaze(size, nblines):\n    nbiter = 0\n    N = size\n    m = np.zeros((N,N))\n    m[0,:] = 1\n    m[-1,:] = 1\n    m[:,0] = 1\n    m[:, -1]= 1\n\n    MAXLINES = nblines\n    mynblines = 0\n    while True:\n        nbiter += 1\n        if nbiter == 10000:\n            #print(\"Inf. loop in maze gen, resetting map & retrying\") # If that happens too often parameters are probably not good\n            #print(\"IL\") # If that happens too often parameters are probably not good\n            m.fill(0)\n            m[0,:] = 1;   m[-1,:] = 1;  m[:,0] = 1;  m[:, -1]= 1;\n            nbiter = 0\n            mynblines = 0\n        rcol = 1 + np.random.randint(N-1)\n        rrow = 1 + np.random.randint(N-1)\n        if m[rrow, rcol] == 1:\n            continue\n        ori = np.random.randint(2)\n        if ori == 0: # horizontal\n            start = rcol\n            while m[rrow, start] == 0:\n                start -= 1\n            end = rcol\n            while m[rrow, end] == 0:\n                end += 1\n            end -= 1\n            start += 1\n            if end-start < 4:\n                continue\n            if np.sum(m[rrow-1, start:end+1]) > 0 or np.sum(m[rrow+1, start:end+1]) > 0:\n                continue\n            if np.sum(m[rrow-2, start:end+1]) > 0 or np.sum(m[rrow+2, start:end+1]) > 0:\n                continue\n            m[rrow, start:end+1] = 1\n            opening = np.random.randint(start+1, end-1)\n            m[rrow, opening] = 0\n            m[rrow, opening+1] = 0\n            mynblines += 1\n        elif ori == 1: # vertical\n            start = rrow\n            while m[start, rcol] == 0:\n                start -= 1\n            end = rrow\n            while m[end, rcol] == 0:\n                end += 1\n            end -= 1\n            start += 1\n            if end-start < 5:\n                continue\n            if np.sum(m[start:end+1, rcol-1]) > 0 or 
np.sum(m[start:end+1, rcol+1]) > 0:\n                continue\n            if np.sum(m[start:end+1, rcol-2]) > 0 or np.sum(m[start:end+1, rcol+2]) > 0:\n                continue\n            m[start:end+1, rcol] = 1\n            opening = np.random.randint(start+1, end-1)\n            m[opening, rcol] = 0\n            m[opening+1, rcol] = 0\n            mynblines += 1\n        if mynblines >= MAXLINES:\n            break\n    return m\n\n\n\nif __name__ == '__main__':\n    \n    #M = genmaze(size=50, nblines=8)\n    M = genmaze(size=15, nblines=4)\n    #M = genmaze(size=19, nblines=4)\n    print(M)\n\n\n"
  },
  {
    "path": "maze/maze.py",
    "content": " \n# Differentiable plasticity: maze exploration task.\n#\n# Copyright (c) 2018 Uber Technologies, Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#    http://www.apache.org/licenses/LICENSE-2.0\n#\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n\n\n\n# NOTE: Do NOT use the 'lstmplastic' type in this code. Instead, look at the\n# awd-lstm-lm directory in the Backpropamine repo\n# (https://github.com/uber-research/backpropamine) for properly implemented\n# plastic LSTMs.\n\n\nimport argparse\nimport pdb\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nfrom torch import optim\nfrom torch.optim import lr_scheduler\nimport random\nimport sys\nimport pickle\nimport time\nimport os\nimport platform\n\n# Uber-only:\nimport OpusHdfsCopy\nfrom OpusHdfsCopy import transferFileToHdfsDir, checkHdfs\n \n#import matplotlib.pyplot as plt\nimport glob\n \n \n \nnp.set_printoptions(precision=4)\n \nETA = .02  # Not used\n \nADDINPUT = 4 # 1 input for the previous reward, 1 input for numstep, 1 for whether currently on reward square, 1 \"Bias\" input\n \nNBACTIONS = 4  # U, D, L, R\n \nRFSIZE = 3 # Receptive field size\n \nTOTALNBINPUTS =  RFSIZE * RFSIZE + ADDINPUT + NBACTIONS\n \n\n##ttype = torch.FloatTensor;    # For CPU\nttype = torch.cuda.FloatTensor; # For GPU\n\n\nclass Network(nn.Module):\n    def __init__(self, params):\n        super(Network, self).__init__()\n        self.rule = params['rule']\n        
self.type = params['type']\n        self.softmax= torch.nn.functional.softmax\n        if params['activ'] == 'tanh':\n            self.activ = F.tanh\n        elif params['activ'] == 'selu':\n            self.activ = F.selu\n        else:\n            raise ValueError('Must choose an activ function')\n        if params['type'] == 'lstm':\n            self.lstm = torch.nn.LSTM(TOTALNBINPUTS, params['hiddensize']).cuda()\n        elif params['type'] == 'rnn':\n            self.i2h = torch.nn.Linear(TOTALNBINPUTS, params['hiddensize']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hiddensize'], params['hiddensize'])).cuda(), requires_grad=True)\n        elif params['type'] == 'homo':\n            self.i2h = torch.nn.Linear(TOTALNBINPUTS, params['hiddensize']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hiddensize'], params['hiddensize'])).cuda(), requires_grad=True)\n            self.alpha = torch.nn.Parameter((.01 * torch.ones(1)).cuda(), requires_grad=True) # Homogenous plasticity: everyone has the same alpha\n            self.eta = torch.nn.Parameter((.01 * torch.ones(1)).cuda(), requires_grad=True)   # Everyone has the same eta\n        elif params['type'] == 'plastic':\n            self.i2h = torch.nn.Linear(TOTALNBINPUTS, params['hiddensize']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hiddensize'], params['hiddensize'])).cuda(), requires_grad=True)\n            self.alpha =  torch.nn.Parameter((.01 * torch.rand(params['hiddensize'], params['hiddensize'])).cuda(), requires_grad=True)\n            self.eta = torch.nn.Parameter((.01 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same eta\n\n\n        elif params['type'] == 'lstmplastic':   # LSTM with plastic connections. 
HIGHLY EXPERIMENTAL, NOT DEBUGGED - see awd-lstm-lm directory at https://github.com/uber-research/backpropamine instead.\n            self.h2f = torch.nn.Linear(params['hiddensize'], params['hiddensize']).cuda()\n            self.h2i = torch.nn.Linear(params['hiddensize'], params['hiddensize']).cuda()\n            self.h2opt = torch.nn.Linear(params['hiddensize'], params['hiddensize']).cuda()\n            \n            # Plasticity only in the recurrent connections, h to c.\n            #self.h2c = torch.nn.Linear(params['hiddensize'], params['hiddensize']).cuda()  # This is replaced by the plastic connection matrices below\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hiddensize'], params['hiddensize'])).cuda(), requires_grad=True)\n            self.alpha =  torch.nn.Parameter((.01 * torch.rand(params['hiddensize'], params['hiddensize'])).cuda(), requires_grad=True)\n            self.eta = torch.nn.Parameter((.01 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same eta\n\n            self.x2f = torch.nn.Linear(TOTALNBINPUTS, params['hiddensize']).cuda()\n            self.x2opt = torch.nn.Linear(TOTALNBINPUTS, params['hiddensize']).cuda()\n            self.x2i = torch.nn.Linear(TOTALNBINPUTS, params['hiddensize']).cuda()\n            self.x2c = torch.nn.Linear(TOTALNBINPUTS, params['hiddensize']).cuda()\n        elif params['type'] == 'lstmmanual':   # An LSTM implemented \"by hand\", to ensure maximum similarity with the plastic LSTM\n            self.h2f = torch.nn.Linear(params['hiddensize'], params['hiddensize']).cuda()\n            self.h2i = torch.nn.Linear(params['hiddensize'], params['hiddensize']).cuda()\n            self.h2opt = torch.nn.Linear(params['hiddensize'], params['hiddensize']).cuda()\n            self.h2c = torch.nn.Linear(params['hiddensize'], params['hiddensize']).cuda()\n            self.x2f = torch.nn.Linear(TOTALNBINPUTS, params['hiddensize']).cuda()\n            self.x2opt = 
torch.nn.Linear(TOTALNBINPUTS, params['hiddensize']).cuda()\n            self.x2i = torch.nn.Linear(TOTALNBINPUTS, params['hiddensize']).cuda()\n            self.x2c = torch.nn.Linear(TOTALNBINPUTS, params['hiddensize']).cuda()\n            ##fgt = F.sigmoid(self.x2f(input) + self.h2f(hidden[0]))\n            ##ipt = F.sigmoid(self.x2i(input) + self.h2i(hidden[0]))\n            ##opt = F.sigmoid(self.x2o(input) + self.h2o(hidden[0]))\n            ##cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, F.tanh(self.x2c(input) + self.h2c(hidden[0])))\n            ##h = torch.mul(opt, cell)\n            ##hidden = (h, cell)\n        else:\n            raise ValueError(\"Which network type?\")\n        self.h2o = torch.nn.Linear(params['hiddensize'], NBACTIONS).cuda()  # From hidden to action output\n        self.h2v = torch.nn.Linear(params['hiddensize'], 1).cuda()          # From hidden to value prediction (for A3C)\n        self.params = params\n        \n        # Notice that the vectors are row vectors, and the matrices are transposed wrt the usual order, following apparent pytorch conventions\n        # Each *column* of w targets a single output neuron\n\n    def forward(self, input, hidden, hebb):\n        if self.type == 'lstm':\n            hactiv, hidden = self.lstm(input.view(1, 1, -1), hidden)  # hactiv is just the h. hidden is the h and the cell state, in a tuple\n            hactiv = hactiv.view(1, -1)\n\n        elif self.type == 'rnn':\n            hactiv = self.activ(self.i2h(input) + hidden.mm(self.w))\n            hidden = hactiv\n\n        # Draft for a \"manual\" lstm:\n        elif self.type== 'lstmmanual':\n            # hidden[0] is the previous h state. 
hidden[1] is the previous c state\n            fgt = F.sigmoid(self.x2f(input) + self.h2f(hidden[0]))\n            ipt = F.sigmoid(self.x2i(input) + self.h2i(hidden[0]))\n            opt = F.sigmoid(self.x2opt(input) + self.h2opt(hidden[0]))\n            cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, F.tanh(self.x2c(input) + self.h2c(hidden[0])))\n            hactiv = torch.mul(opt, F.tanh(cell))\n            #pdb.set_trace()\n            hidden = (hactiv, cell)\n            if np.isnan(np.sum(hactiv.data.cpu().numpy())) or np.isnan(np.sum(hidden[1].data.cpu().numpy())) :\n                raise ValueError(\"Nan detected !\")\n\n        elif self.type == 'lstmplastic':\n            fgt = F.sigmoid(self.x2f(input) + self.h2f(hidden[0]))\n            ipt = F.sigmoid(self.x2i(input) + self.h2i(hidden[0]))\n            opt = F.sigmoid(self.x2opt(input) + self.h2opt(hidden[0]))\n            #cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, F.tanh(self.x2c(input) + self.h2c(hidden[0])))\n            \n            #Need to think what the inputs and outputs should be for the\n            #plasticity. It might be worth introducing an additional stage\n            #consisting of whatever is multiplied by ipt and then added to the\n            #cell state, rather than the full cell state.... 
But we can\n            #experiment both!\n            \n            #cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, F.tanh(self.x2c(input) + hidden[0].mm(self.w + torch.mul(self.alpha, hebb)))) #  self.h2c(hidden[0])))\n            inputtocell =  F.tanh(self.x2c(input) + hidden[0].mm(self.w + torch.mul(self.alpha, hebb)))\n            cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, inputtocell) #  self.h2c(hidden[0])))\n\n\n            if self.rule == 'hebb':\n                #hebb = (1 - self.eta) * hebb + self.eta * torch.bmm(hidden[0].unsqueeze(2), cell.unsqueeze(1))[0]\n                hebb = (1 - self.eta) * hebb + self.eta * torch.bmm(hidden[0].unsqueeze(2), inputtocell.unsqueeze(1))[0]\n            elif self.rule == 'oja':\n                # NOTE: NOT SURE ABOUT THE OJA VERSION !!\n                hebb = hebb + self.eta * torch.mul((hidden[0][0].unsqueeze(1) - torch.mul(hebb , inputtocell[0].unsqueeze(0))) , inputtocell[0].unsqueeze(0))  # Oja's rule. Remember that yin, yout are row vectors (dim (1,N)). Also, broadcasting!\n                #hebb = hebb + self.eta * torch.mul((hidden[0].unsqueeze(1) - torch.mul(hebb , hactiv[0].unsqueeze(0))) , hactiv[0].unsqueeze(0))  # Oja's rule. Remember that yin, yout are row vectors (dim (1,N)). 
Also, broadcasting!\n            hactiv = torch.mul(opt, F.tanh(cell))\n            #pdb.set_trace()\n            hidden = (hactiv, cell)\n            if np.isnan(np.sum(hactiv.data.cpu().numpy())) or np.isnan(np.sum(hidden[1].data.cpu().numpy())) :\n                raise ValueError(\"Nan detected !\")\n\n        elif self.type == 'plastic':\n            hactiv = self.activ(self.i2h(input) + hidden.mm(self.w + torch.mul(self.alpha, hebb)))\n            if self.rule == 'hebb':\n                hebb = (1 - self.eta) * hebb + self.eta * torch.bmm(hidden.unsqueeze(2), hactiv.unsqueeze(1))[0]\n            elif self.rule == 'oja':\n                hebb = hebb + self.eta * torch.mul((hidden[0].unsqueeze(1) - torch.mul(hebb , hactiv[0].unsqueeze(0))) , hactiv[0].unsqueeze(0))  # Oja's rule. Remember that yin, yout are row vectors (dim (1,N)). Also, broadcasting!\n            else:\n                raise ValueError(\"Must specify learning rule ('hebb' or 'oja')\")\n            hidden = hactiv\n        \n        elif self.type == 'homo':\n            hactiv = self.activ(self.i2h(input) + hidden.mm(self.w + self.alpha * hebb))\n            if self.rule == 'hebb':\n                hebb = (1 - self.eta) * hebb + self.eta * torch.bmm(hidden.unsqueeze(2), hactiv.unsqueeze(1))[0]\n            elif self.rule == 'oja':\n                hebb = hebb + self.eta * torch.mul((hidden[0].unsqueeze(1) - torch.mul(hebb , hactiv[0].unsqueeze(0))) , hactiv[0].unsqueeze(0))  # Oja's rule. Remember that yin, yout are row vectors (dim (1,N)). 
Also, broadcasting!\n            else:\n                raise ValueError(\"Must specify learning rule ('hebb' or 'oja')\")\n            hidden = hactiv\n        \n        activout = self.softmax(self.h2o(hactiv))   # Action selection\n        valueout = self.h2v(hactiv)                 # Value prediction (for A3C)\n\n        return activout, valueout, hidden, hebb\n\n    def initialZeroHebb(self):\n        return Variable(torch.zeros(self.params['hiddensize'], self.params['hiddensize']) , requires_grad=False).cuda()\n\n    def initialZeroState(self):\n        if self.params['type'] == 'lstm':\n            return (Variable(torch.zeros(1, 1, self.params['hiddensize']), requires_grad=False).cuda() , Variable(torch.zeros(1, 1, self.params['hiddensize']), requires_grad=False ).cuda() )\n        elif self.params['type'] == 'lstmmanual' or self.params['type'] == 'lstmplastic':\n            return (Variable(torch.zeros(1, self.params['hiddensize']), requires_grad=False).cuda() , Variable(torch.zeros(1, self.params['hiddensize']), requires_grad=False ).cuda() )\n        elif self.params['type'] == 'rnn' or self.params['type'] == 'plastic' or self.params['type'] == 'homo':\n            return Variable(torch.zeros(1, self.params['hiddensize']), requires_grad=False ).cuda() \n        else:\n            raise ValueError(\"Which type?\")\n\n\n\ndef train(paramdict):\n    #params = dict(click.get_current_context().params)\n    print(\"Starting training...\")\n    params = {}\n    #params.update(defaultParams)\n    params.update(paramdict)\n    print(\"Passed params: \", params)\n    print(platform.uname())\n    #params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['interpresdelay']) * params['nbclasses']) + params['prestimetest']  # Total number of steps per episode\n    suffix = \"maze_\"+\"\".join([str(x)+\"_\" if pair[0] != 'nbsteps' and pair[0] != 'rngseed' and pair[0] != 'save_every' and pair[0] != 'test_every' else '' for pair in 
sorted(zip(params.keys(), params.values()), key=lambda x:x[0] ) for x in pair])[:-1] + \"_rngseed_\" + str(params['rngseed'])   # Turning the parameters into a nice suffix for filenames\n\n    # Initialize random seeds (first two redundant?)\n    print(\"Setting random seeds\")\n    np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n\n    print(\"Initializing network\")\n    net = Network(params)\n    print (\"Shape of all optimized parameters:\", [x.size() for x in net.parameters()])\n    allsizes = [torch.numel(x.data.cpu()) for x in net.parameters()]\n    print (\"Size (numel) of all optimized elements:\", allsizes)\n    print (\"Total size (numel) of all optimized elements:\", sum(allsizes))\n\n    print(\"Initializing optimizer\")\n    optimizer = torch.optim.Adam(net.parameters(), lr=1.0*params['lr'], eps=1e-4)\n    #scheduler = torch.optim.lr_scheduler.StepLR(optimizer, gamma=params['gamma'], step_size=params['steplr'])\n\n    LABSIZE = params['labsize'] \n    lab = np.ones((LABSIZE, LABSIZE))\n    CTR = LABSIZE // 2 \n\n    # Simple cross maze\n    #lab[CTR, 1:LABSIZE-1] = 0\n    #lab[1:LABSIZE-1, CTR] = 0\n\n\n    # Double-T maze\n    #lab[CTR, 1:LABSIZE-1] = 0\n    #lab[1:LABSIZE-1, 1] = 0\n    #lab[1:LABSIZE-1, LABSIZE - 2] = 0\n\n    # Grid maze\n    lab[1:LABSIZE-1, 1:LABSIZE-1].fill(0)\n    for row in range(1, LABSIZE - 1):\n        for col in range(1, LABSIZE - 1):\n            if row % 2 == 0 and col % 2 == 0:\n                lab[row, col] = 1\n    lab[CTR,CTR] = 0 # Not really necessary, but nicer to not start on a wall, and perhaps helps localization by introducing a detectable irregularity in the center?\n\n\n\n    all_losses = []\n    all_losses_objective = []\n    all_losses_eval = []\n    all_losses_v = []\n    lossbetweensaves = 0\n    nowtime = time.time()\n    \n    print(\"Starting episodes...\")\n    sys.stdout.flush()\n\n    pos = 0\n    hidden = net.initialZeroState()\n    hebb = 
net.initialZeroHebb()\n\n\n    # Starting episodes!\n    \n    for numiter in range(params['nbiter']):\n        \n        PRINTTRACE = 0\n        if (numiter+1) % (1 + params['print_every']) == 0:\n            PRINTTRACE = 1\n\n        # Note: it doesn't matter if the reward is on the center (reward is only computed after an action is taken). All we need is not to put it on a wall or pillar (lab=1)\n        rposr = 0; rposc = 0\n        if params['rp'] == 0:\n            # The reward can fall anywhere in the maze (as long as it's not on a wall or pillar)\n            while lab[rposr, rposc] == 1:\n                rposr = np.random.randint(1, LABSIZE - 1)\n                rposc = np.random.randint(1, LABSIZE - 1)\n        elif params['rp'] == 1:\n            # Constrain the reward to fall on the periphery of the maze\n            while lab[rposr, rposc] == 1 or (rposr != 1 and rposr != LABSIZE -2 and rposc != 1 and rposc != LABSIZE-2):\n                rposr = np.random.randint(1, LABSIZE - 1)\n                rposc = np.random.randint(1, LABSIZE - 1)\n        #print(\"Reward pos:\", rposr, rposc)\n\n        # Agent always starts an episode from the center\n        posc = CTR\n        posr = CTR\n\n        optimizer.zero_grad()\n        loss = 0\n        lossv = 0\n        hidden = net.initialZeroState()\n        hebb = net.initialZeroHebb()\n\n\n        reward = 0.0\n        rewards = []\n        vs = []\n        logprobs = []\n        sumreward = 0.0\n        dist = 0\n\n        for numstep in range(params['eplen']):\n            \n            \n            inputsN = np.zeros((1, TOTALNBINPUTS), dtype='float32')\n            inputsN[0, 0:RFSIZE * RFSIZE] = lab[posr - RFSIZE//2:posr + RFSIZE//2 +1, posc - RFSIZE //2:posc + RFSIZE//2 +1].flatten()\n            \n            inputs = torch.from_numpy(inputsN).cuda()\n            # Previous chosen action\n            #inputs[0][numactionchosen] = 1\n            inputs[0][-1] = 1 # Bias neuron\n            inputs[0][-2] = numstep\n            inputs[0][-3] = reward\n            \n            # 
Running the network\n            y, v, hidden, hebb = net(Variable(inputs, requires_grad=False), hidden, hebb)  # y  should output probabilities; v is the value prediction \n        \n            distrib = torch.distributions.Categorical(y)\n            actionchosen = distrib.sample()  # sample() returns a Pytorch tensor of size 1; this is needed for the backprop below\n            numactionchosen = actionchosen.data[0]    # Turn to scalar\n\n            # Target position, based on the selected action\n            tgtposc = posc\n            tgtposr = posr\n            if numactionchosen == 0:  # Up\n                tgtposr -= 1\n            elif numactionchosen == 1:  # Down\n                tgtposr += 1\n            elif numactionchosen == 2:  # Left\n                tgtposc -= 1\n            elif numactionchosen == 3:  # Right\n                tgtposc += 1\n            else:\n                raise ValueError(\"Wrong Action\")\n            \n            reward = 0.0\n            if lab[tgtposr][tgtposc] == 1:\n                reward = -.1\n            else:\n                dist += 1\n                posc = tgtposc\n                posr = tgtposr\n\n            # Did we hit the reward location ? 
Increase reward and teleport!\n            # Note that it doesn't matter if we teleport onto the reward, since reward hitting is only evaluated after the (obligatory) move\n            if rposr == posr and rposc == posc:\n                reward += 10\n                if params['randstart'] == 1:\n                    posr = np.random.randint(1, LABSIZE - 1)\n                    posc = np.random.randint(1, LABSIZE - 1)\n                    while lab[posr, posc] == 1:\n                        posr = np.random.randint(1, LABSIZE - 1)\n                        posc = np.random.randint(1, LABSIZE - 1)\n                else:\n                    posr = CTR\n                    posc = CTR\n\n\n            # Store the obtained reward, value prediction, and log-probabilities, for this time step\n            rewards.append(reward)\n            sumreward += reward\n            vs.append(v)\n            logprobs.append(distrib.log_prob(actionchosen))\n\n            # A3C/A2C has an entropy reward on the output probabilities, to\n            # encourage exploration. Our version of PyTorch does not have an\n            # entropy() function for Distribution, so we use a penalty on the\n            # sum of squares instead, which has the same basic property\n            # (discourages concentration). It really does help!\n            loss += params['bentropy'] * y.pow(2).sum()   \n\n            #if PRINTTRACE:\n            #    print(\"Probabilities:\", y.data.cpu().numpy(), \"Picked action:\", numactionchosen, \", got reward\", reward)\n\n\n        # Do the A2C ! (essentially copied from V. 
Mnih, https://arxiv.org/abs/1602.01783, Algorithm S3)\n        R = 0\n        gammaR = params['gr']\n        for numstepb in reversed(range(params['eplen'])) :\n            R = gammaR * R + rewards[numstepb]\n            lossv += (vs[numstepb][0] - R).pow(2) \n            loss -= logprobs[numstepb] * (R - vs[numstepb].data[0][0])  \n\n\n\n        if PRINTTRACE:\n            print(\"lossv: \", lossv.data.cpu().numpy()[0])\n            print (\"Total reward for this episode:\", sumreward, \"Dist:\", dist)\n\n        # Do we want to squash rewards for stabilization? \n        if params['squash'] == 1:\n            if sumreward < 0:\n                sumreward = -np.sqrt(-sumreward)\n            else:\n                sumreward = np.sqrt(sumreward)\n        elif params['squash'] == 0:\n            pass\n        else:\n            raise ValueError(\"Incorrect value for squash parameter\")\n\n        # Mixing the reward loss and the value-prediction loss\n        loss += params['blossv'] * lossv\n        loss /= params['eplen']\n        loss.backward()\n\n        #scheduler.step()\n        optimizer.step()\n        #torch.cuda.empty_cache()  \n\n        lossnum = loss.data[0]\n        lossbetweensaves += lossnum\n        if (numiter + 1) % 10 == 0:\n            all_losses_objective.append(lossnum)\n            all_losses_eval.append(sumreward)\n            all_losses_v.append(lossv.data[0])\n\n\n\n        # Algorithm done. 
Now print statistics and save files.\n\n        if (numiter+1) % params['print_every'] == 0:\n\n            print(numiter, \"====\")\n            print(\"Mean loss: \", lossbetweensaves / params['print_every'])\n            lossbetweensaves = 0\n            previoustime = nowtime\n            nowtime = time.time()\n            print(\"Time spent on last\", params['print_every'], \"iters: \", nowtime - previoustime)\n            if params['type'] == 'plastic' or params['type'] == 'lstmplastic':\n                print(\"ETA: \", net.eta.data.cpu().numpy(), \"alpha[0,1]: \", net.alpha.data.cpu().numpy()[0,1], \"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n            elif params['type'] == 'rnn':\n                print(\"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n\n        if (numiter+1) % params['save_every'] == 0:\n            print(\"Saving files...\")\n            losslast100 = np.mean(all_losses_objective[-100:])\n            print(\"Average loss over the last 100 episodes:\", losslast100)\n            print(\"Saving local files...\")\n            with open('params_'+suffix+'.dat', 'wb') as fo:\n                    pickle.dump(params, fo)\n            with open('lossv_'+suffix+'.txt', 'w') as thefile:\n                for item in all_losses_v:\n                        thefile.write(\"%s\\n\" % item)\n            with open('loss_'+suffix+'.txt', 'w') as thefile:\n                for item in all_losses_eval:\n                        thefile.write(\"%s\\n\" % item)\n            torch.save(net.state_dict(), 'torchmodel_'+suffix+'.dat')\n            # Uber-only\n            print(\"Saving HDFS files...\")\n            if checkHdfs():\n                print(\"Transfering to HDFS...\")\n                transferFileToHdfsDir('loss_'+suffix+'.txt', '/ailabs/tmiconi/gridlab/')\n                transferFileToHdfsDir('torchmodel_'+suffix+'.dat', '/ailabs/tmiconi/gridlab/')\n                transferFileToHdfsDir('params_'+suffix+'.dat', '/ailabs/tmiconi/gridlab/')\n         
   #print(\"Saved!\")\n\n\n\nif __name__ == \"__main__\":\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--rngseed\", type=int, help=\"random seed\", default=0)\n    #parser.add_argument(\"--clamp\", type=float, help=\"maximum (absolute value) gradient for clamping\", default=1000000.0)\n    parser.add_argument(\"--bentropy\", type=float, help=\"coefficient for the A2C 'entropy' reward (really Simpson index concentration measure)\", default=0.1)\n    parser.add_argument(\"--blossv\", type=float, help=\"coefficient for the A2C value prediction loss\", default=.03)\n    parser.add_argument(\"--labsize\", type=int, help=\"size of the labyrinth; must be odd\", default=9)\n    parser.add_argument(\"--randstart\", type=int, help=\"when hitting reward, should we teleport to random location (1) or center (0)?\", default=1)\n    parser.add_argument(\"--rp\", type=int, help=\"whether the reward should be on the periphery\", default=0)\n    parser.add_argument(\"--squash\", type=int, help=\"squash reward through signed sqrt (1 or 0)\", default=0)\n    #parser.add_argument(\"--nbarms\", type=int, help=\"number of arms\", default=2)\n    #parser.add_argument(\"--nbseq\", type=int, help=\"number of sequences between reinitializations of hidden/Hebbian state and position\", default=3)\n    parser.add_argument(\"--activ\", help=\"activ function ('tanh' or 'selu')\", default='tanh')\n    parser.add_argument(\"--rule\", help=\"learning rule ('hebb' or 'oja')\", default='oja')\n    parser.add_argument(\"--type\", help=\"network type ('rnn' or 'plastic')\", default='rnn')\n    parser.add_argument(\"--gr\", type=float, help=\"gammaR: discounting factor for rewards\", default=.9)\n    parser.add_argument(\"--lr\", type=float, help=\"learning rate (Adam optimizer)\", default=1e-4)\n    parser.add_argument(\"--eplen\", type=int, help=\"length of episodes\", default=250)\n    parser.add_argument(\"--hiddensize\", type=int, help=\"size of the recurrent (hidden) layer\", 
default=200)\n    #parser.add_argument(\"--steplr\", type=int, help=\"duration of each step in the learning rate annealing schedule\", default=100000000)\n    #parser.add_argument(\"--gamma\", type=float, help=\"learning rate annealing factor\", default=0.3)\n    parser.add_argument(\"--nbiter\", type=int, help=\"number of learning cycles\", default=1000000)\n    parser.add_argument(\"--save_every\", type=int, help=\"number of cycles between successive save points\", default=200)\n    parser.add_argument(\"--print_every\", type=int, help=\"number of cycles between successive printing of information\", default=100)\n    args = parser.parse_args()\n    argvars = vars(args)\n    argdict = {k: argvars[k] for k in argvars if argvars[k] is not None}\n    #train()\n    train(argdict)\n\n"
  },
  {
    "path": "maze/opus.docker",
    "content": "#tmiconi_rl\n#latest\n#.\n\n\n#FROM localhost:5000/opus-deep-learning:master-test-2017_9_7_20_56_10\nFROM localhost:5000/opus-deep-learning:master-test-2018_1_3_0_38_14\n\n\n\n\nRUN mkdir /home/work\n\nCOPY ./*.py /home/work/\n\nENV LC_ALL C.UTF-8\nENV  LANG C.UTF-8\n\n"
  },
  {
    "path": "maze/opus.docker.old",
    "content": "#tmiconi_rl\n#latest\n#.\n\n\n#FROM localhost:5000/opus-deep-learning:master-test-2017_9_7_20_56_10\nFROM opus-deep-learning-py3:master-prod-2019_2_5_4_54_39\n#FROM opus-deep-learning:master--2018_9_20_18_2_31\n\n\n\n\nRUN mkdir /home/work\n\nCOPY ./*.py /home/work/\n\nENV LC_ALL C.UTF-8\nENV  LANG C.UTF-8\n\n"
  },
  {
    "path": "maze/plotfigure.py",
    "content": "import numpy as np\nimport glob\nimport matplotlib.pyplot as plt\nimport scipy\nfrom scipy import stats\n\ncolorz = ['r', 'b', 'g', 'c', 'm', 'y', 'orange', 'k']\n\n\ngroupnames = glob.glob('./tmpWorked/loss_*absize_11_*rngseed_1.txt')  \n#groupnames = glob.glob('./tmp8/loss_*eplen_251*densize_200*absize_11_*ndstart_1*rngseed_1.txt')  \n#groupnames = glob.glob('./tmp8/loss_*eplen_251*densize_200*absize_11_*ndstart_1*rngseed_1.txt')  \n\n\n#groupnames = glob.glob('./tmp/loss_*new*eplen_251*rngseed_0.txt')  \n#groupnames = glob.glob('./tmp/loss_*new*eplen_250*rngseed_0.txt')  \n\nplt.rc('font', size=14)\n\n\n# If you can only use 7 runs, smooth the losses within each run to obtain more reliable estimates of performance!\n\n\ndef mavg(x, N):\n  cumsum = np.cumsum(np.insert(x, 0, 0)) \n  return (cumsum[N:] - cumsum[:-N]) / N\n\nplt.ion()\n#plt.figure(figsize=(5,4))  # Smaller figure = relative larger fonts\nplt.figure()\n\nallmedianls = []\nalllosses = []\nposcol = 0\nminminlen = 999999\nfor numgroup, groupname in enumerate(groupnames):\n    if \"lstm\" in groupname:\n        continue\n    g = groupname[:-6]+\"*\"\n    print(\"====\", groupname)\n    fnames = glob.glob(g)\n    fulllosses=[]\n    losses=[]\n    lgts=[]\n    for fn in fnames:\n        if False:\n            if \"seed_7\" in fn:\n                continue\n            if \"seed_8\" in fn:\n                continue\n\n\n        z = np.loadtxt(fn)\n        \n        z = mavg(z, 20)  # For each run, we average the losses over K successive episodes - otherwise figure is unreadable due to noise!\n\n        z = z[::10] # Decimation - speed things up!\n        z = mavg(z, 10)\n\n        #z = z[:5001]\n        \n        if len(z) < 9000:\n            print(fn)\n            continue\n        #z = z[:90]\n        lgts.append(len(z))\n        fulllosses.append(z)\n    minlen = min(lgts)\n    if minlen < minminlen:\n        minminlen = minlen\n    print(minlen)\n    #if minlen < 1000:\n    #    
continue\n    for z in fulllosses:\n        losses.append(z[:minlen])\n\n    losses = np.array(losses)\n    alllosses.append(losses)\n    \n    meanl = np.mean(losses, axis=0)\n    stdl = np.std(losses, axis=0)\n    #cil = stdl / np.sqrt(losses.shape[0]) * 1.96  # 95% confidence interval - assuming normality\n    cil = stdl / np.sqrt(losses.shape[0]) * 2.5  # 95% confidence interval - approximated with the t-distribution for 7 d.f. (?)\n\n    medianl = np.median(losses, axis=0)\n    allmedianls.append(medianl)\n    q1l = np.percentile(losses, 25, axis=0)\n    q3l = np.percentile(losses, 75, axis=0)\n    \n    highl = np.max(losses, axis=0)\n    lowl = np.min(losses, axis=0)\n    #highl = meanl+stdl\n    #lowl = meanl-stdl\n\n    xx = range(len(meanl))\n\n    # xticks and labels\n    #xt = range(0, len(meanl), 2000)\n    xt = range(0, 10001, 2000)\n    xtl = [str(10 * 10 * i) for i in xt]   # Because of decimation above, and only every 10th loss is recorded in the files\n\n    if \"plastic\" in groupname:\n        lbl = \"Plastic\"\n    elif \"homo\" in groupname:\n        lbl = \"Homogeneous Plastic\"\n    elif \"rnn\" in groupname:\n        lbl = \"Non-plastic\"\n\n    #plt.plot(mavg(meanl, 100), label=g) #, color='blue')\n    #plt.fill_between(xx, lowl, highl,  alpha=.2)\n    #plt.fill_between(xx, q1l, q3l,  alpha=.1)\n    #plt.plot(meanl) #, color='blue')\n    ####plt.plot(mavg(medianl, 100), label=g) #, color='blue')  # mavg changes the number of points !\n    #plt.plot(mavg(q1l, 100), label=g, alpha=.3) #, color='blue')\n    #plt.plot(mavg(q3l, 100), label=g, alpha=.3) #, color='blue')\n    #plt.fill_between(xx, q1l, q3l,  alpha=.2)\n    #plt.plot(medianl, label=g) #, color='blue')\n   \n    AVGSIZE = 1\n    \n    xlen = len(mavg(q1l, AVGSIZE))\n    plt.plot(mavg(medianl, AVGSIZE), color=colorz[poscol % len(colorz)], label=lbl)  # mavg changes the number of points !\n    plt.fill_between( range(xlen), mavg(q1l, AVGSIZE), mavg(q3l, AVGSIZE),  alpha=.2, 
color=colorz[poscol % len(colorz)])\n    \n    #xlen = len(mavg(meanl, AVGSIZE))\n    #plt.plot(mavg(meanl, AVGSIZE), label=g, color=colorz[poscol % len(colorz)])  # mavg changes the number of points !\n    #plt.fill_between( range(xlen), mavg(meanl - cil, AVGSIZE), mavg(meanl + cil, AVGSIZE),  alpha=.2, color=colorz[poscol % len(colorz)])\n    \n    poscol += 1\n    \n    #plt.fill_between( range(xlen), mavg(lowl, 100), mavg(highl, 100),  alpha=.2, color=colorz[numgroup % len(colorz)])\n\n    #plt.plot(mavg(losses[0], 1000), label=g, color=colorz[numgroup % len(colorz)])\n    #for curve in losses[1:]:\n    #    plt.plot(mavg(curve, 1000), color=colorz[numgroup % len(colorz)])\n\nps = []\n# Adapt for varying lengths across groups\n#for n in range(0, alllosses[0].shape[1], 3):\nfor n in range(0, minminlen):\n    ps.append(scipy.stats.ranksums(alllosses[0][:,n], alllosses[1][:,n]).pvalue)\nps = np.array(ps)\nprint(np.mean(ps[-500:] < .05), np.mean(ps[-500:] < .01))\n\nplt.legend(loc='best', fontsize=14)\n#plt.xlabel('Loss (sum square diff. b/w final output and target)')\nplt.xlabel('Number of Episodes')\nplt.ylabel('Reward')\nplt.xticks(xt, xtl)\n#plt.tight_layout()\n\n\n\n"
  },
  {
    "path": "maze/plotresults.py",
    "content": "# Code for plotting results\n#\n# Copyright (c) 2018 Uber Technologies, Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#    http://www.apache.org/licenses/LICENSE-2.0\n#\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n\n\nimport numpy as np\nimport glob\nimport matplotlib.pyplot as plt\nimport scipy\nfrom scipy import stats\n\ncolorz = ['r', 'b', 'g', 'c', 'm', 'y', 'orange', 'k']\n\ngroupnames = glob.glob('./tmp/loss_*new*eplen_251*rngseed_0.txt')  \n#groupnames = glob.glob('./tmp/loss_*new*eplen_250*rngseed_0.txt')  \n#groupnames = glob.glob('./tmp/loss_*new*.9_*rngseed_0.txt')  \n\n\n\n# If you can only use 7 runs, smooth the losses within each run to obtain more reliable estimates of performance!\n\n\ndef mavg(x, N):\n  cumsum = np.cumsum(np.insert(x, 0, 0)) \n  return (cumsum[N:] - cumsum[:-N]) / N\n\nplt.ion()\n#plt.figure(figsize=(5,4))  # Smaller figure = relative larger fonts\nplt.figure()\n\nallmedianls = []\nalllosses = []\nposcol = 0\nminminlen = 999999\nfor numgroup, groupname in enumerate(groupnames):\n    #if \"lstm\" not in groupname:\n    #    continue\n    g = groupname[:-6]+\"*\"\n    print(\"====\", groupname)\n    fnames = glob.glob(g)\n    fulllosses=[]\n    losses=[]\n    lgts=[]\n    for fn in fnames:\n        if \"COPY\" in fn:\n            continue\n        if False:\n            #if \"seed_3\" in fn:\n            #    continue\n            #if \"seed_7\" in fn:\n            #    continue\n            if \"seed_8\" in fn:\n                continue\n            if \"seed_9\" in fn:\n               
 continue\n            if \"seed_10\" in fn:\n                continue\n            if \"seed_11\" in fn:\n                continue\n            if \"seed_12\" in fn:\n                continue\n            if \"seed_13\" in fn:\n                continue\n            if \"seed_14\" in fn:\n                continue\n            if \"seed_15\" in fn:\n                continue\n        z = np.loadtxt(fn)\n        \n        z = mavg(z, 10)  # For each run, we average the losses over K successive episodes\n\n        z = z[::10] # Decimation - speed things up!\n        \n        if len(z) < 500:\n            print(fn)\n            continue\n        #z = z[:90]\n        lgts.append(len(z))\n        fulllosses.append(z)\n    minlen = min(lgts)\n    if minlen < minminlen:\n        minminlen = minlen\n    print(minlen)\n    #if minlen < 1000:\n    #    continue\n    for z in fulllosses:\n        losses.append(z[:minlen])\n\n    losses = np.array(losses)\n    alllosses.append(losses)\n    \n    meanl = np.mean(losses, axis=0)\n    stdl = np.std(losses, axis=0)\n    cil = stdl / np.sqrt(losses.shape[0]) * 1.96  # 95% confidence interval - assuming normality\n    #cil = stdl / np.sqrt(losses.shape[0]) * 2.5  # 95% confidence interval - approximated with the t-distribution for 7 d.f.\n\n    medianl = np.median(losses, axis=0)\n    allmedianls.append(medianl)\n    q1l = np.percentile(losses, 25, axis=0)\n    q3l = np.percentile(losses, 75, axis=0)\n    \n    highl = np.max(losses, axis=0)\n    lowl = np.min(losses, axis=0)\n    #highl = meanl+stdl\n    #lowl = meanl-stdl\n\n    xx = range(len(meanl))\n\n    # xticks and labels\n    xt = range(0, len(meanl), 500)\n    xtl = [str(10 * 10 * i) for i in xt]   # Because of decimation above, and only every 10th loss is recorded in the files\n\n    #plt.plot(mavg(meanl, 100), label=g) #, color='blue')\n    #plt.fill_between(xx, lowl, highl,  alpha=.2)\n    #plt.fill_between(xx, q1l, q3l,  alpha=.1)\n    #plt.plot(meanl) #, 
color='blue')\n    ####plt.plot(mavg(medianl, 100), label=g) #, color='blue')  # mavg changes the number of points !\n    #plt.plot(mavg(q1l, 100), label=g, alpha=.3) #, color='blue')\n    #plt.plot(mavg(q3l, 100), label=g, alpha=.3) #, color='blue')\n    #plt.fill_between(xx, q1l, q3l,  alpha=.2)\n    #plt.plot(medianl, label=g) #, color='blue')\n   \n    AVGSIZE = 20\n    \n    xlen = len(mavg(q1l, AVGSIZE))\n    plt.plot(mavg(medianl, AVGSIZE), label=g, color=colorz[poscol % len(colorz)])  # mavg changes the number of points !\n    plt.fill_between( range(xlen), mavg(q1l, AVGSIZE), mavg(q3l, AVGSIZE),  alpha=.2, color=colorz[poscol % len(colorz)])\n    \n    #xlen = len(mavg(meanl, AVGSIZE))\n    #plt.plot(mavg(meanl, AVGSIZE), label=g, color=colorz[poscol % len(colorz)])  # mavg changes the number of points !\n    #plt.fill_between( range(xlen), mavg(meanl - cil, AVGSIZE), mavg(meanl + cil, AVGSIZE),  alpha=.2, color=colorz[poscol % len(colorz)])\n    \n    poscol += 1\n    \n    #plt.fill_between( range(xlen), mavg(lowl, 100), mavg(highl, 100),  alpha=.2, color=colorz[numgroup % len(colorz)])\n\n    #plt.plot(mavg(losses[0], 1000), label=g, color=colorz[numgroup % len(colorz)])\n    #for curve in losses[1:]:\n    #    plt.plot(mavg(curve, 1000), color=colorz[numgroup % len(colorz)])\n\nps = []\n# Adapt for varying lengths across groups\n#for n in range(0, alllosses[0].shape[1], 3):\n\n#for n in range(0, minminlen):\n#    ps.append(scipy.stats.ranksums(alllosses[0][:,n], alllosses[1][:,n]).pvalue)\n#ps = np.array(ps)\n\nplt.legend(loc='best', fontsize=6)\n#plt.xlabel('Loss (sum square diff. b/w final output and target)')\nplt.xlabel('Number of Episodes')\nplt.ylabel('Loss')\nplt.xticks(xt, xtl)\n#plt.tight_layout()\n\n\n\n"
  },
  {
    "path": "maze/request.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2018_2_9_17_17_4\",\n    \"name\":\"Exp10_new_B_gr9_hs_100_labsize_11_eplen251_lstmplastic\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 python3 gridlab.py   --nbiter 1000000 --rule oja --squash 0 --hiddensize 100 --type lstmplastic --lr 1e-4 --eplen 251 --print_every 100 --save_every 1000  --bentropy 0.1 --blossv .03 --randstart 1 --gr .9 --rp 0 --labsize 11 --rngseed {{mesos.instance}}\",\n    \"ramMB\":8000,\n    \"gpus\":1,\n    \"diskMB\":8000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi_peloton\",\n    \"instances\":15,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"1080ti\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "maze/request_devbox.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2019_5_3_11_4_2\",\n    \"cpus\":2.0,\n    \"ramMB\":6000,\n    \"gpus\":1,\n    \"diskMB\":8000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":1,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"p6000\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "maze/request_modplast.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2019_5_3_11_4_2\",\n    \"name\":\"Maze_Modplast_hs100_eplen200_addpw3_bv0.1_bent0.03_rew10_bs30_gc4_lr1e-4_l20\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 python3 batch.py  --eplen 200 --hs 100  --lr 1e-4 --l2 0 --addpw 3 --pe 1000 --blossv 0.1 --bent 0.03 --rew 10 --save_every 1000 --rsp 1 --type modplast --da tanh  --nbiter 200002 --msize 13  --wp 0.0 --bs 30 --gc 4.0 --rngseed {{mesos.instance}}\",\n    \"ramMB\":8000,\n    \"gpus\":1,\n    \"diskMB\":8000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":10,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"p6000\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "maze/request_modul.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2019_5_3_11_4_2\",\n    \"name\":\"Maze_Modul_hs100_eplen200_addpw3_bv0.1_bent0.03_rew10_bs30_gc4_lr1e-4_l20\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 python3 batch.py  --eplen 200 --hs 100  --lr 1e-4 --l2 0 --addpw 3 --pe 1000 --blossv 0.1 --bent 0.03 --rew 10 --save_every 1000 --rsp 1 --type modul --da tanh  --nbiter 200002 --msize 13  --wp 0.0 --bs 30 --gc 4.0 --rngseed {{mesos.instance}}\",\n    \"ramMB\":8000,\n    \"gpus\":1,\n    \"diskMB\":8000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":10,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"p6000\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "maze/request_plastic.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2019_5_3_11_4_2\",\n    \"name\":\"Maze_Plastic_hs101_eplen200_addpw3_bv0.1_bent0.03_rew10_bs30_gc4_lr1e-4_l20\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 python3 batch.py  --eplen 200 --hs 101  --lr 1e-4 --l2 0 --addpw 3 --pe 1000 --blossv 0.1 --bent 0.03 --rew 10 --save_every 1000 --rsp 1 --type plastic --da tanh  --nbiter 200002 --msize 13  --wp 0.0 --bs 30 --gc 4.0 --rngseed {{mesos.instance}}\",\n    \"ramMB\":8000,\n    \"gpus\":1,\n    \"diskMB\":8000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":10,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"p6000\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "maze/request_rnn.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2019_5_3_11_4_2\",\n    \"name\":\"Maze_RNN_hs139_eplen200_addpw3_bv0.1_bent0.03_rew10_bs30_gc4_lr1e-4_l20\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 python3 batch.py  --eplen 200 --hs 139 --lr 1e-4 --l2 0 --addpw 3 --pe 1000 --blossv 0.1 --bent 0.03 --rew 10 --save_every 1000 --rsp 1 --type rnn --da tanh  --nbiter 200002 --msize 13  --wp 0.0 --bs 30 --gc 4.0 --rngseed {{mesos.instance}}\",\n    \"ramMB\":8000,\n    \"gpus\":1,\n    \"diskMB\":8000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":10,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"p6000\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "maze/request_rnn100neurons.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2019_5_3_11_4_2\",\n    \"name\":\"Maze_RNN_hs00_eplen200_addpw3_bv0.1_bent0.03_rew10_bs30_gc4_lr1e-4_l20\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 python3 batch.py  --eplen 200 --hs 100 --lr 1e-4 --l2 0 --addpw 3 --pe 1000 --blossv 0.1 --bent 0.03 --rew 10 --save_every 1000 --rsp 1 --type rnn --da tanh  --nbiter 200002 --msize 13  --wp 0.0 --bs 30 --gc 4.0 --rngseed {{mesos.instance}}\",\n    \"ramMB\":8000,\n    \"gpus\":1,\n    \"diskMB\":8000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":10,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"p6000\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "maze/testbatch.py",
    "content": "import argparse\nimport pdb\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nimport torch.nn.functional as F\nfrom torch import optim\nfrom torch.optim import lr_scheduler\nimport random\nimport sys\nimport pickle\nimport time\nimport os\nimport platform\n#import makemaze\n\n#import matplotlib.pyplot as plt\nimport glob\n\n\n\n\n\nnp.set_printoptions(precision=4)\n\nNBDA = 1  # Number of different DA output neurons. At present, the code assumes NBDA=1 and will NOT WORK if you change this.\n\n\nADDINPUT = 4 # 1 input for the previous reward, 1 input for numstep, 1 unused, 1 \"Bias\" input\n\nNBACTIONS = 4  # U, D, L, R\n\nRFSIZE = 3 # Receptive Field\n\nTOTALNBINPUTS =  RFSIZE * RFSIZE + ADDINPUT + NBACTIONS\n\n##ttype = torch.FloatTensor;\n#ttype = torch.cuda.FloatTensor;\n\n\nclass Network(nn.Module):\n    def __init__(self, params):\n        super(Network, self).__init__()\n        self.rule = params['rule']\n        self.type = params['type']\n        self.softmax= torch.nn.functional.softmax\n        #if params['activ'] == 'tanh':\n        self.activ = F.tanh\n        #elif params['activ'] == 'selu':\n        #    self.activ = F.selu\n        #else:\n        #    raise ValueError('Must choose an activ function')\n        if params['type'] == 'lstm':\n            self.lstm = torch.nn.LSTM(TOTALNBINPUTS, params['hs']).cuda()\n        elif params['type'] == 'rnn':\n            self.i2h = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            #self.inputnegmask = Variable(torch.ones(1, params['hs']), requires_grad=False).cuda()\n            #self.inputnegmask[0, :TOTALNBINPUTS] = 0   # no modulation for 2nd half\n        elif 
params['type'] == 'modplast' or params['type'] == 'modplast2':\n            self.i2h = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            self.alpha =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            #self.w =  torch.nn.Parameter((.01 * torch.t(torch.rand(params['hs'], params['hs']))).cuda(), requires_grad=True)\n            #self.alpha =  torch.nn.Parameter((.01 * torch.t(torch.rand(params['hs'], params['hs']))).cuda(), requires_grad=True)\n            self.eta = torch.nn.Parameter((.01 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same eta\n            #self.inputnegmask = Variable(torch.ones(1, params['hs']), requires_grad=False).cuda()\n            #self.inputnegmask[0, :TOTALNBINPUTS] = 0   # no modulation for 2nd half\n            self.h2DA = torch.nn.Linear(params['hs'], NBDA).cuda()\n        elif params['type'] == 'plastic' :\n            self.i2h = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            self.alpha =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            self.eta = torch.nn.Parameter((.01 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same eta\n            #self.inputnegmask = Variable(torch.ones(1, params['hs']), requires_grad=False).cuda()\n            #self.inputnegmask[0, :TOTALNBINPUTS] = 0   # no modulation for 2nd half\n        elif params['type'] == 'modul' or params['type'] == 'modul2':\n            self.i2h = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            self.alpha =  torch.nn.Parameter((.01 * 
torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            # Note that initial eta is higher (faster) than before\n            self.eta = torch.nn.Parameter((.1 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same eta\n            self.etaet = torch.nn.Parameter((.1 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same etaet\n            self.etapw = torch.nn.Parameter((.1 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same etapw\n            self.h2DA = torch.nn.Linear(params['hs'], NBDA).cuda()\n            # The daweights vectors are weight vectors from the DA output neurons to the network hidden (recurrent) neurons\n            #self.daweights0 = Variable(torch.ones(1, params['hs']), requires_grad=False).cuda()\n            #self.daweights0[0, (params['hs'] // 2):] = 0   # no modulation for 2nd half\n            #self.inputnegmask = Variable(torch.ones(1, params['hs']), requires_grad=False).cuda()\n            #self.inputnegmask[0, :TOTALNBINPUTS] = 0   # no modulation for 2nd half\n\n            #else:\n            #    raise ValueError(\"Must specify which half of the network receives modulation\")\n            self.daweights1 = Variable(torch.ones(1, params['hs']), requires_grad=False).cuda()\n            self.daweights1[0, :(params['hs'] // 4)] = 0\n            self.daweights1[0, -(params['hs'] // 4):] = 0\n        else:\n            raise ValueError(\"Which network type?\")\n        self.h2o = torch.nn.Linear(params['hs'], NBACTIONS).cuda()\n        self.h2v = torch.nn.Linear(params['hs'], 1).cuda()\n        self.params = params\n\n        # Notice that the vectors are row vectors, and the matrices are transposed wrt the usual order, following apparent pytorch conventions\n        # Each *column* of w targets a single output neuron\n\n    def forward(self, inputs, hidden, hebb, et, pw):\n        BATCHSIZE = self.params['bs']\n        HS = self.params['hs']\n\n        if self.type == 
'rnn':\n            hactiv = self.activ(self.i2h(inputs).view(BATCHSIZE, HS, 1) + torch.matmul(self.w.view(1, HS, HS),\n                hidden.view(BATCHSIZE, HS, 1))).view(BATCHSIZE, HS)\n            hidden = hactiv\n            #activout = self.softmax(self.h2o(hactiv))\n            activout = self.h2o(hactiv)   # Linear!\n            valueout = self.h2v(hactiv)\n            #valueout = 0\n\n\n        elif self.type == 'plastic':\n            # Each row of w and hebb contains the input weights to a single neuron\n            # hidden = x, hactiv = y\n            hactiv = self.activ(self.i2h(inputs).view(BATCHSIZE, HS, 1) + torch.matmul((self.w + torch.mul(self.alpha, hebb)),\n                            hidden.view(BATCHSIZE, HS, 1))).view(BATCHSIZE, HS)\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed later\n            valueout = self.h2v(hactiv)\n\n            if self.rule == 'hebb':\n                deltahebb =  torch.bmm(hactiv.view(BATCHSIZE, HS, 1), hidden.view(BATCHSIZE, 1, HS)) # batched outer product...should it be other way round?\n            elif self.rule == 'oja':\n                deltahebb =  torch.mul(hactiv.view(BATCHSIZE, HS, 1), (hidden.view(BATCHSIZE, 1, HS) - torch.mul(self.w.view(1, HS, HS), hactiv.view(BATCHSIZE, HS, 1))))\n            else:\n                raise ValueError(\"Must specify learning rule ('hebb' or 'oja')\")\n\n            if self.params['addpw'] == 3:\n                # Note that there is no decay, even in the Hebb-rule case : additive only!\n                # Hard clamp\n                hebb = torch.clamp( hebb +  self.eta * deltahebb, min=-1.0, max=1.0)\n            elif self.params['addpw'] == 2:\n                # Note that there is no decay, even in the Hebb-rule case : additive only!\n                # Soft clamp\n                hebb = torch.clamp( hebb +  torch.clamp(self.eta * deltahebb, min=0.0) * (1 - hebb) +  torch.clamp(self.eta * deltahebb, max=0.0) * (hebb + 1) , 
min=-1.0, max=1.0)\n            elif self.params['addpw'] == 1: # Purely additive, tends to make the meta-learning diverge. No decay/clamp.\n                hebb = hebb + self.eta * deltahebb\n            elif self.params['addpw'] == 0:\n                # We do it the normal way. Note that here, Hebb-rule is decaying.\n                # There is probably a way to make it more efficient.\n                # Note 2: For Oja's rule, there is no difference between addpw 0 and addpw1\n                if self.rule == 'hebb':\n                    hebb = (1 - self.eta) * hebb + self.eta * deltahebb\n                elif self.rule == 'oja':\n                    hebb =  hebb + self.eta  * deltahebb\n\n            hidden = hactiv\n\n\n        elif self.type == 'modplast':\n            # The actual network update should be the same as for \"plastic\". Only the Hebbian updates should be different\n            # The columns of w and pw are the inputs weights to a single neuron\n            hactiv = self.activ(self.i2h(inputs) + hidden.mm(self.w.view(HS, HS) + torch.mul(self.alpha.view(HS,HS), hebb)))\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed later\n            valueout = self.h2v(hactiv)\n\n            # Now computing the Hebbian updates...\n\n            if self.params['da'] == 'tanh':\n                DAout = F.tanh(self.h2DA(hactiv))\n            elif self.params['da'] == 'sig':\n                DAout = F.sigmoid(self.h2DA(hactiv))\n            elif self.params['da'] == 'lin':\n                DAout =  self.h2DA(hactiv)\n            else:\n                raise ValueError(\"Which transformation for DAout ?\")\n\n            if self.rule == 'hebb':\n                deltahebb = torch.bmm(hidden.unsqueeze(2), hactiv.unsqueeze(1))[0]\n            elif self.rule == 'oja':\n                deltahebb = torch.mul((hidden[0].unsqueeze(1) - torch.mul(hebb , hactiv[0].unsqueeze(0))) , hactiv[0].unsqueeze(0))\n\n            if 
self.params['addpw'] == 3: # Hard clamp, purely additive\n                # Note that we do the same for Hebb and Oja's rule\n                hebb1 = torch.clamp(hebb + DAout[0,0] * deltahebb, min=-1.0, max=1.0)\n            elif self.params['addpw'] == 2:\n                # Note that there is no decay, even in the Hebb-rule case : additive only!\n                hebb1 = torch.clamp( hebb +  torch.clamp(DAout[0,0] * deltahebb, min=0.0) * (1 - hebb) +  torch.clamp(DAout[0,0] * deltahebb, max=0.0) * (hebb + 1) , min=-1.0, max=1.0)\n            elif self.params['addpw'] == 1: # Purely additive, tends to make the meta-learning diverge\n                # Note that we do the same for Hebb and Oja's rule\n                hebb1 = hebb + DAout[0,0] * deltahebb\n\n            elif self.params['addpw'] == 0:\n                # We do it the normal way. Note that here, Hebb-rule is decaying.\n                # There is probably a way to make it more efficient by grouping it with the computation of the other (non-modulated) half.\n                # NOTE: This can go awry if DAout can go negative!\n                # Note 2: For Oja's rule, there is no difference between addpw 0 and addpw1\n                if self.rule == 'hebb':\n                    hebb1 = (1 - DAout[0,0]) * hebb + DAout[0,0] * deltahebb\n                elif self.rule == 'oja':\n                    hebb1=  hebb + DAout[0,0] * deltahebb\n            else:\n                raise ValueError(\"Which additive form for plastic weights?\")\n\n            # The non-neuromodulated half of the network just does standard plasticity, using learned self.eta.\n            if self.rule == 'hebb':\n                hebb2 = (1 - self.eta) * hebb + self.eta * deltahebb\n            elif self.rule == 'oja':\n                hebb2 = hebb + self.eta * deltahebb\n            else:\n                raise ValueError(\"Must specify learning rule ('hebb' or 'oja')\")\n\n            if self.params['fm'] == 1:\n                hebb = 
hebb1\n            elif self.params['fm'] == 0:\n                hebb = torch.cat( (hebb1[:, :self.params['hs']//2], hebb2[:, self.params['hs']//2:]), dim=1)\n            else:\n                raise ValueError(\"Must select whether fully modulated or not\")\n\n            hidden = hactiv\n\n\n\n\n\n        elif self.type == 'modplast_old':\n\n            #Here we compute the same deltahebb for the whole network, and use\n            #the same addpw for the whole network too.  #Only difference between\n            #modulated and non-modulated halves is whether eta is the network's\n            #(learned) eta parameter or the neuromodulator output DAout\n\n            # The rows of w and hebb are the inputs weights to a single neuron\n            # hidden = x, hactiv = y\n            hactiv = self.activ(self.i2h(inputs).view(BATCHSIZE, HS, 1) + torch.matmul((self.w + torch.mul(self.alpha, hebb)),\n                            hidden.view(BATCHSIZE, HS, 1))).view(BATCHSIZE, HS)\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed later\n            valueout = self.h2v(hactiv)\n\n            # Now computing the Hebbian updates...\n\n            # With batching, DAout is a matrix of size BS x 1 (Really BS x NBDA, but we assume NBDA=1 for now in the deltahebb multiplication below)\n            if self.params['da'] == 'tanh':\n                DAout = F.tanh(self.h2DA(hactiv))\n            elif self.params['da'] == 'sig':\n                DAout = F.sigmoid(self.h2DA(hactiv))\n            elif self.params['da'] == 'lin':\n                DAout =  self.h2DA(hactiv)\n            else:\n                raise ValueError(\"Which transformation for DAout ?\")\n\n            # deltahebb has shape BS x HS x HS\n            # Each row of hebb contain the input weights to a neuron\n            if self.rule == 'hebb':\n                deltahebb =  torch.bmm(hactiv.view(BATCHSIZE, HS, 1), hidden.view(BATCHSIZE, 1, HS)) # batched outer product...should 
it be other way round?\n            elif self.rule == 'oja':\n                deltahebb =  torch.mul(hactiv.view(BATCHSIZE, HS, 1), (hidden.view(BATCHSIZE, 1, HS) - torch.mul(self.w.view(1, HS, HS), hactiv.view(BATCHSIZE, HS, 1))))\n            else:\n                raise ValueError(\"Must specify learning rule ('hebb' or 'oja')\")\n\n\n            if self.params['addpw'] == 3: # Hard clamp, purely additive\n                # Note that we do the same for Hebb and Oja's rule\n                hebb1 = torch.clamp(hebb + DAout.view(BATCHSIZE, 1, 1) * deltahebb, min=-1.0, max=1.0)\n                hebb2 = torch.clamp(hebb + self.eta * deltahebb, min=-1.0, max=1.0)\n            elif self.params['addpw'] == 2:\n                # Note that there is no decay, even in the Hebb-rule case : additive only!\n                hebb1 = torch.clamp( hebb +  torch.clamp(DAout.view(BATCHSIZE, 1, 1) * deltahebb, min=0.0) * (1 - hebb) +\n                        torch.clamp(DAout.view(BATCHSIZE, 1, 1)  * deltahebb, max=0.0) * (hebb + 1) , min=-1.0, max=1.0)\n                hebb2 = torch.clamp( hebb +  torch.clamp(self.eta * deltahebb, min=0.0) * (1 - hebb) +  torch.clamp(self.eta * deltahebb, max=0.0) * (hebb + 1) , min=-1.0, max=1.0)\n            elif self.params['addpw'] == 1: # Purely additive. This will almost certainly diverge, don't use it!\n                hebb1 = hebb + DAout.view(BATCHSIZE, 1, 1) * deltahebb\n                hebb2 = hebb + self.eta * deltahebb\n\n            elif self.params['addpw'] == 0:\n                # We do it the old way. 
Note that here, Hebb-rule is decaying.\n                # There is probably a way to make it more efficient\n                # NOTE: THIS WILL GO AWRY if DAout is allowed to go outside [0,1]!\n                # Note 2: For Oja's rule, there is no difference between addpw 0 and addpw1\n                if self.rule == 'hebb':\n                    hebb1 = (1 - DAout.view(BATCHSIZE,1,1)) * hebb + DAout.view(BATCHSIZE, 1, 1) * deltahebb\n                    hebb2 = (1 - self.eta) * hebb + self.eta *  deltahebb\n                elif self.rule == 'oja':\n                    hebb1=  hebb + DAout.view(BATCHSIZE, 1, 1) * deltahebb\n                    hebb2=  hebb + self.eta * deltahebb\n            else:\n                raise ValueError(\"Which additive form for plastic weights?\")\n\n            if self.params['fm'] == 1:\n                hebb = hebb1\n            elif self.params['fm'] == 0:\n                hebb = torch.cat( (hebb1[:, :self.params['hs']//2, :], hebb2[:,  self.params['hs'] // 2:, :]), dim=1) # Maybe along dim=2 instead?...\n            else:\n                raise ValueError(\"Must select whether fully modulated or not\")\n\n            hidden = hactiv\n\n\n        elif self.type == 'modul':\n\n            # The rows of w and hebb are the inputs weights to a single neuron\n            # hidden = x, hactiv = y\n            hactiv = self.activ(self.i2h(inputs).view(BATCHSIZE, HS, 1) + torch.matmul((self.w + torch.mul(self.alpha, hebb)),\n                            hidden.view(BATCHSIZE, HS, 1))).view(BATCHSIZE, HS)\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed later\n            valueout = self.h2v(hactiv)\n\n            # Now computing the Hebbian updates...\n\n            # With batching, DAout is a matrix of size BS x 1 (Really BS x NBDA, but we assume NBDA=1 for now in the deltahebb multiplication below)\n            if self.params['da'] == 'tanh':\n                DAout = F.tanh(self.h2DA(hactiv))\n           
 elif self.params['da'] == 'sig':\n                DAout = F.sigmoid(self.h2DA(hactiv))\n            elif self.params['da'] == 'lin':\n                DAout =  self.h2DA(hactiv)\n            else:\n                raise ValueError(\"Which transformation for DAout ?\")\n\n\n            # We need to select the order of operations; network update, e.t. update, neuromodulated incorporation into plastic weights\n            # One possibility (for now go with this one):\n            #    - computing all outputs from current inputs, including DA\n            #    - incorporating neuromodulated Hebb/eligibility trace into plastic weights\n            #    - computing updated hebb/eligibility traces\n            # Another possibility (modul2):\n            #    - computing all outputs from current inputs, including DA\n            #    - computing updated Hebb/eligibility traces\n            #    - incorporating this modified Hebb into plastic weights through neuromodulation\n\n\n            # For Hebb (not et or pw); this is only used if fm=0, for the non-modulated part of the network\n            # If fm=0:\n            # One half of the network receives neuromodulation. 
The other just\n            # does plain Hebbian plasticity; note that the eta's for the\n            # Hebbian trace and the eligibility trace are different\n            if self.params['fm']==0:\n                if self.rule == 'hebb':\n                    deltahebb =  torch.bmm(hactiv.view(BATCHSIZE, HS, 1), hidden.view(BATCHSIZE, 1, HS)) # batched outer product...should it be other way round?\n                elif self.rule == 'oja':\n                    deltahebb =  torch.mul(hactiv.view(BATCHSIZE, HS, 1), (hidden.view(BATCHSIZE, 1, HS) - torch.mul(self.w.view(1, HS, HS), hactiv.view(BATCHSIZE, HS, 1))))\n                else:\n                    raise ValueError(\"Must specify learning rule ('hebb' or 'oja')\")\n\n            # In modul2 we compute deltaet and update et here too; here we compute them later\n\n            if self.params['addpw'] == 3:\n                # Hard clamp\n                deltapw = DAout.view(BATCHSIZE,1,1) * et\n                pw1 = torch.clamp(pw + deltapw, min=-1.0, max=1.0)\n                if self.params['fm']==0:\n                    hebb = torch.clamp(hebb + self.eta * deltahebb, min=-1.0, max=1.0)\n            elif self.params['addpw'] == 2:\n                deltapw = DAout.view(BATCHSIZE,1,1) * et\n                # This constrains the pw to stay within [-1, 1] (we could also do that by putting a tanh on top of it, but instead we want pw itself to remain within that range, to avoid large gradients and facilitate movement back to 0)\n                # The outer clamp is there for safety. 
In theory the expression within that clamp is \"softly\" constrained to stay within [-1, 1], but finite-size effects might throw it off.\n                pw1 = torch.clamp( pw +  torch.clamp(deltapw, min=0.0) * (1 - pw) +  torch.clamp(deltapw, max=0.0) * (pw + 1) , min=-.99999, max=.99999)\n                if self.params['fm']==0:\n                    hebb = torch.clamp( hebb +  torch.clamp(self.eta * deltahebb, min=0.0) * (1 - hebb) +  torch.clamp(self.eta * deltahebb, max=0.0) * (hebb + 1) , min=-1.0, max=1.0)\n            elif self.params['addpw'] == 1: # Purely additive, tends to make the meta-learning diverge\n                deltapw = DAout.view(BATCHSIZE,1,1) * et\n                pw1 = pw + deltapw\n                if self.params['fm'] == 0:\n                    hebb = hebb + self.eta * deltahebb\n            elif self.params['addpw'] == 0:\n                # We do it the old way, with a decay term.\n                # This will FAIL if DAout is allowed to go outside [0,1]\n                # Note: this makes the plastic weights decaying!\n                pw1 = (1 - DAout.view(BATCHSIZE,1,1)) * pw + DAout.view(BATCHSIZE, 1, 1) * et\n                if self.params['fm']==0:\n                    if self.rule == 'hebb':\n                        hebb = (1 - self.eta) * hebb + self.eta * deltahebb\n                    elif self.rule == 'oja':\n                        hebb=  hebb + self.eta * deltahebb\n            # Should we have a fully neuromodulated network, or only half?\n            if self.params['fm'] == 1:\n                pw = pw1\n            elif self.params['fm'] == 0:\n                pw = torch.cat( (hebb[:, :self.params['hs']//2, :], pw1[:,  self.params['hs'] // 2:, :]), dim=1) # Maybe along dim=2 instead?...\n            else:\n                raise ValueError(\"Must select whether fully modulated or not\")\n\n            # Updating the eligibility trace - always a simple decay term.\n            # Note that self.etaet != self.eta (which is used 
for hebb, i.e. the non-modulated part)\n            deltaet =  torch.bmm(hactiv.view(BATCHSIZE, HS, 1), hidden.view(BATCHSIZE, 1, HS)) # batched outer product...should it be other way round?\n            et = (1 - self.etaet) * et + self.etaet *  deltaet\n\n            hidden = hactiv\n\n\n\n\n\n        elif self.type == 'modul2':\n\n            # The rows of w and hebb are the inputs weights to a single neuron\n            # hidden = x, hactiv = y\n            hactiv = self.activ(self.i2h(inputs).view(BATCHSIZE, HS, 1) + torch.matmul((self.w + torch.mul(self.alpha, hebb)),\n                            hidden.view(BATCHSIZE, HS, 1))).view(BATCHSIZE, HS)\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed later\n            valueout = self.h2v(hactiv)\n\n            # Now computing the Hebbian updates...\n\n            # With batching, DAout is a matrix of size BS x 1 (Really BS x NBDA, but we assume NBDA=1 for now in the deltahebb multiplication below)\n            if self.params['da'] == 'tanh':\n                DAout = F.tanh(self.h2DA(hactiv))\n            elif self.params['da'] == 'sig':\n                DAout = F.sigmoid(self.h2DA(hactiv))\n            elif self.params['da'] == 'lin':\n                DAout =  self.h2DA(hactiv)\n            else:\n                raise ValueError(\"Which transformation for DAout ?\")\n\n\n            # We need to select the order of operations; network update, e.t. 
update, neuromodulated incorporation into plastic weights\n            # One possibility (for now go with this one):\n            #    - computing all outputs from current inputs, including DA\n            #    - incorporating neuromodulated Hebb/eligibility trace into plastic weights\n            #    - computing updated hebb/eligibility traces\n            # Another possibility (modul2):\n            #    - computing all outputs from current inputs, including DA\n            #    - computing updated Hebb/eligibility traces\n            #    - incorporating this modified Hebb into plastic weights through neuromodulation\n\n\n            # For Hebb (not et or pw); this is only used if fm=0, for the non-modulated part of the network\n            # If fm=0:\n            # One half of the network receives neuromodulation. The other just\n            # does plain Hebbian plasticity; note that the eta's for the\n            # Hebbian trace and the eligibility trace are different\n            if self.params['fm']==0:\n                if self.rule == 'hebb':\n                    deltahebb =  torch.bmm(hactiv.view(BATCHSIZE, HS, 1), hidden.view(BATCHSIZE, 1, HS)) # batched outer product...should it be other way round?\n                elif self.rule == 'oja':\n                    deltahebb =  torch.mul(hactiv.view(BATCHSIZE, HS, 1), (hidden.view(BATCHSIZE, 1, HS) - torch.mul(self.w.view(1, HS, HS), hactiv.view(BATCHSIZE, HS, 1))))\n                else:\n                    raise ValueError(\"Must specify learning rule ('hebb' or 'oja')\")\n\n            # Updating the eligibility trace - always a simple decay term.\n            # Note that self.etaet != self.eta (which is used for hebb, i.e. 
the non-modulated part)\n            deltaet =  torch.bmm(hactiv.view(BATCHSIZE, HS, 1), hidden.view(BATCHSIZE, 1, HS)) # batched outer product...should it be other way round?\n            et = (1 - self.etaet) * et + self.etaet *  deltaet\n\n            if self.params['addpw'] == 3:\n                # Hard clamp\n                deltapw = DAout.view(BATCHSIZE,1,1) * et\n                pw1 = torch.clamp(pw + deltapw, min=-1.0, max=1.0)\n                if self.params['fm']==0:\n                    hebb = torch.clamp(hebb + self.eta * deltahebb, min=-1.0, max=1.0)\n            elif self.params['addpw'] == 2:\n                deltapw = DAout.view(BATCHSIZE,1,1) * et\n                # This constrains the pw to stay within [-1, 1] (we could also do that by putting a tanh on top of it, but instead we want pw itself to remain within that range, to avoid large gradients and facilitate movement back to 0)\n                # The outer clamp is there for safety. In theory the expression within that clamp is \"softly\" constrained to stay within [-1, 1], but finite-size effects might throw it off.\n                pw1 = torch.clamp( pw +  torch.clamp(deltapw, min=0.0) * (1 - pw) +  torch.clamp(deltapw, max=0.0) * (pw + 1) , min=-.99999, max=.99999)\n                if self.params['fm']==0:\n                    hebb = torch.clamp( hebb +  torch.clamp(self.eta * deltahebb, min=0.0) * (1 - hebb) +  torch.clamp(self.eta * deltahebb, max=0.0) * (hebb + 1) , min=-1.0, max=1.0)\n            elif self.params['addpw'] == 1: # Purely additive, tends to make the meta-learning diverge\n                deltapw = DAout.view(BATCHSIZE,1,1) * et\n                pw1 = pw + deltapw\n                if self.params['fm'] == 0:\n                    hebb = hebb + self.eta * deltahebb\n            elif self.params['addpw'] == 0:\n                # We do it the old way, with a decay term.\n                # This will FAIL if DAout is allowed to go outside [0,1]\n                # Note: this makes 
the plastic weights decaying!\n                pw1 = (1 - DAout.view(BATCHSIZE,1,1)) * pw + DAout.view(BATCHSIZE, 1, 1) * et\n                if self.params['fm']==0:\n                    if self.rule == 'hebb':\n                        hebb = (1 - self.eta) * hebb + self.eta * deltahebb\n                    elif self.rule == 'oja':\n                        hebb=  hebb + self.eta * deltahebb\n            # Should we have a fully neuromodulated network, or only half?\n            if self.params['fm'] == 1:\n                pw = pw1\n            elif self.params['fm'] == 0:\n                pw = torch.cat( (hebb[:, :self.params['hs']//2, :], pw1[:,  self.params['hs'] // 2:, :]), dim=1) # Maybe along dim=2 instead?...\n            else:\n                raise ValueError(\"Must select whether fully modulated or not\")\n\n\n            hidden = hactiv\n\n\n\n\n\n        return activout, valueout, hidden, hebb, et, pw\n\n\n\n    def initialZeroHebb(self):\n        #return Variable(torch.zeros(self.params['bs'], self.params['hs'], self.params['hs']) , requires_grad=False).cuda()\n        return Variable(torch.zeros(self.params['hs'], self.params['hs']) , requires_grad=False).cuda()\n    def initialZeroPlasticWeights(self):\n        return Variable(torch.zeros(self.params['bs'], self.params['hs'], self.params['hs']) , requires_grad=False).cuda()\n\n    def initialZeroState(self):\n        BATCHSIZE = self.params['bs']\n        return Variable(torch.zeros(BATCHSIZE, self.params['hs']), requires_grad=False ).cuda()\n\n\n\ndef train(paramdict):\n    #params = dict(click.get_current_context().params)\n\n    #TOTALNBINPUTS =  RFSIZE * RFSIZE + ADDINPUT + NBNONRESTACTIONS\n    print(\"Starting training...\")\n    params = {}\n    #params.update(defaultParams)\n    params.update(paramdict)\n    print(\"Passed params: \", params)\n    print(platform.uname())\n    #params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['interpresdelay']) * params['nbclasses']) + 
params['prestimetest']  # Total number of steps per episode\n    suffix = \"btch_\"+\"\".join([str(x)+\"_\" if pair[0] != 'nbsteps' and pair[0] != 'rngseed' and pair[0] != 'save_every' and pair[0] != 'test_every' and pair[0] != 'pe' else '' for pair in sorted(zip(params.keys(), params.values()), key=lambda x:x[0] ) for x in pair])[:-1] + \"_rngseed_\" + str(params['rngseed'])   # Turning the parameters into a nice suffix for filenames\n\n    # Initialize random seeds (first two redundant?)\n    print(\"Setting random seeds\")\n    np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n    #print(click.get_current_context().params)\n\n    print(\"Initializing network\")\n    net = Network(params)\n    print (\"Shape of all optimized parameters:\", [x.size() for x in net.parameters()])\n    allsizes = [torch.numel(x.data.cpu()) for x in net.parameters()]\n    print (\"Size (numel) of all optimized elements:\", allsizes)\n    print (\"Total size (numel) of all optimized elements:\", sum(allsizes))\n\n    #total_loss = 0.0\n    print(\"Initializing optimizer\")\n    optimizer = torch.optim.Adam(net.parameters(), lr=1.0*params['lr'], eps=1e-4, weight_decay=params['l2'])\n    #optimizer = torch.optim.SGD(net.parameters(), lr=1.0*params['lr'])\n    #scheduler = torch.optim.lr_scheduler.StepLR(optimizer, gamma=params['gamma'], step_size=params['steplr'])\n\n    #LABSIZE = params['lsize']\n    #lab = np.ones((LABSIZE, LABSIZE))\n    #CTR = LABSIZE // 2\n\n    # Simple cross maze\n    #lab[CTR, 1:LABSIZE-1] = 0\n    #lab[1:LABSIZE-1, CTR] = 0\n\n\n    # Double-T maze\n    #lab[CTR, 1:LABSIZE-1] = 0\n    #lab[1:LABSIZE-1, 1] = 0\n    #lab[1:LABSIZE-1, LABSIZE - 2] = 0\n\n    # Grid maze\n    #lab[1:LABSIZE-1, 1:LABSIZE-1].fill(0)\n    #for row in range(1, LABSIZE - 1):\n    #    for col in range(1, LABSIZE - 1):\n    #        if row % 2 == 0 and col % 2 == 0:\n    #            lab[row, col] = 1\n    
#lab[CTR,CTR] = 0 # Not strictly necessary, but perhaps helps localization by introducing a detectable irregularity in the center\n\n    BATCHSIZE = params['bs']\n\n    LABSIZE = params['msize']\n    lab = np.ones((LABSIZE, LABSIZE))\n    CTR = LABSIZE // 2\n\n    # Simple cross maze\n    #lab[CTR, 1:LABSIZE-1] = 0\n    #lab[1:LABSIZE-1, CTR] = 0\n\n\n    # Double-T maze\n    #lab[CTR, 1:LABSIZE-1] = 0\n    #lab[1:LABSIZE-1, 1] = 0\n    #lab[1:LABSIZE-1, LABSIZE - 2] = 0\n\n    # Grid maze\n    lab[1:LABSIZE-1, 1:LABSIZE-1].fill(0)\n    for row in range(1, LABSIZE - 1):\n        for col in range(1, LABSIZE - 1):\n            if row % 2 == 0 and col % 2 == 0:\n                lab[row, col] = 1\n    # Not strictly necessary, but cleaner since we start the agent at the\n    # center for each episode; may help localization in some maze sizes\n    # (including 13 and 9, but not 11) by introducing a detectable irregularity\n    # in the center:\n    lab[CTR,CTR] = 0\n\n\n\n    all_losses = []\n    all_losses_objective = []\n    all_total_rewards = []\n    all_losses_v = []\n    lossbetweensaves = 0\n    nowtime = time.time()\n    meanrewards = np.zeros((LABSIZE, LABSIZE))\n    meanrewardstmp = np.zeros((LABSIZE, LABSIZE, params['eplen']))\n\n\n    pos = 0\n    hidden = net.initialZeroState()\n    hebb = net.initialZeroHebb()\n    pw = net.initialZeroPlasticWeights()\n\n    #celoss = torch.nn.CrossEntropyLoss() # For supervised learning - not used here\n\n\n    print(\"Starting episodes!\")\n\n    for numiter in range(params['nbiter']):\n\n        PRINTTRACE = 0\n        #if (numiter+1) % (1 + params['pe']) == 0:\n        if (numiter+1) % (params['pe']) == 0:\n            PRINTTRACE = 1\n\n        #lab = makemaze.genmaze(size=LABSIZE, nblines=4)\n        #count = np.zeros((LABSIZE, LABSIZE))\n\n        # Select the reward location for this episode - not on a wall!\n        # And not on the center either! 
(though not sure how useful that restriction is...)\n        # We always start the episode from the center (when hitting reward, we may teleport either to center or to a random location depending on params['rsp'])\n        posr = {}; posc = {}\n        rposr = {}; rposc = {}\n        for nb in range(BATCHSIZE):\n            # Note: it doesn't really matter if the reward is on the center. All we need is not to put it on a wall or pillar (lab=1)\n            myrposr = 0; myrposc = 0\n            # This one is for positioning the reward only in the periphery!\n            #while lab[myrposr, myrposc] == 1 or (myrposr != 1 and myrposr != LABSIZE -2 and myrposc != 1 and myrposc != LABSIZE-2):\n            while lab[myrposr, myrposc] == 1 or (myrposr == CTR and myrposc == CTR):\n                myrposr = np.random.randint(1, LABSIZE - 1)\n                myrposc = np.random.randint(1, LABSIZE - 1)\n            rposr[nb] = myrposr; rposc[nb] = myrposc\n            #print(\"Reward pos:\", rposr, rposc)\n            # Agent always starts an episode from the center\n            posc[nb] = CTR\n            posr[nb] = CTR\n\n        optimizer.zero_grad()\n        loss = 0\n        lossv = 0\n        hidden = net.initialZeroState()\n        hebb = net.initialZeroHebb()\n        et = net.initialZeroHebb() # Eligibility Trace is identical to Hebbian Trace in shape\n        pw = net.initialZeroPlasticWeights()\n        numactionchosen = 0\n\n\n        reward = np.zeros(BATCHSIZE)\n        sumreward = np.zeros(BATCHSIZE)\n        rewards = []\n        vs = []\n        logprobs = []\n        dist = 0\n        numactionschosen = np.zeros(BATCHSIZE, dtype='int32')\n\n        #reloctime = np.random.randint(params['eplen'] // 4, (3 * params['eplen']) // 4)\n\n        #print(\"EPISODE \", numiter)\n        for numstep in range(params['eplen']):\n\n\n\n            ## We randomly relocate the reward halfway through\n            #if numstep == reloctime:\n            #    rposr = 0; rposc = 
0\n            #    while lab[rposr, rposc] == 1 or (rposr == CTR and rposc == CTR):\n            #        rposr = np.random.randint(1, LABSIZE - 1)\n            #        rposc = np.random.randint(1, LABSIZE - 1)\n\n\n            inputs = np.zeros((BATCHSIZE, TOTALNBINPUTS), dtype='float32')\n\n            labg = lab.copy()\n            #labg[rposr, rposc] = -1  # The agent can see the reward if it falls within its RF\n            for nb in range(BATCHSIZE):\n                inputs[nb, 0:RFSIZE * RFSIZE] = labg[posr[nb] - RFSIZE//2:posr[nb] + RFSIZE//2 +1, posc[nb] - RFSIZE //2:posc[nb] + RFSIZE//2 +1].flatten() * 1.0\n\n                # Previous chosen action\n                inputs[nb, RFSIZE * RFSIZE +1] = 1.0 # Bias neuron\n                inputs[nb, RFSIZE * RFSIZE +2] = numstep / params['eplen']\n                #inputs[0, RFSIZE * RFSIZE +3] = 1.0 * reward # Reward from previous time step\n                inputs[nb, RFSIZE * RFSIZE +3] = 1.0 * reward[nb]\n                inputs[nb, RFSIZE * RFSIZE + ADDINPUT + numactionschosen[nb]] = 1\n                #inputs = 100.0 * inputs  # input boosting : Very bad with clamp=0\n\n            inputsC = torch.from_numpy(inputs).cuda()\n            # Might be better:\n            #if rposr == posr and rposc = posc:\n            #    inputs[0][-4] = 100.0\n            #else:\n            #    inputs[0][-4] = 0\n\n            # Running the network\n\n            ## Running the network\n            y, v, hidden, hebb, et, pw = net(Variable(inputsC, requires_grad=False), hidden, hebb, et, pw)  # y  should output raw scores, not probas\n\n            # For now:\n            #numactionchosen = np.argmax(y.data[0])\n            # But wait, this is bad, because the network needs to see the\n            # reward signal to guide its own (within-episode) learning... 
and\n            # argmax might not provide enough exploration for this!\n\n            #ee = np.exp(y.data[0].cpu().numpy())\n            #numactionchosen = np.random.choice(NBNONRESTACTIONS, p = ee / (1e-10 + np.sum(ee)))\n\n            y = F.softmax(y, dim=1)\n            # Must convert y to probas to use this !\n            distrib = torch.distributions.Categorical(y)\n            actionschosen = distrib.sample()\n            logprobs.append(distrib.log_prob(actionschosen))\n            numactionschosen = actionschosen.data.cpu().numpy()    # Turn to scalar\n            reward = np.zeros(BATCHSIZE, dtype='float32')\n            #if numiter == 115 and numstep == 99: identical\n            #if numiter == 125 and numstep == 99: diff\n            #if numiter == 120 and numstep == 99: identical\n            #if numiter == 122 and numstep == 99:  diff\n            #if numiter == 121 and numstep == 99: identical\n            #if numiter == 122 and numstep == 14:  diff (a little, ~1e-3)\n            #if numiter == 122 and numstep == 11:  diff (2e-2), rposr,rposc identical, posr different (5 vs 9 for batch)\n            #if numiter == 122 and numstep == 10:  identical, rposr 5, rposc 6, posr 5, posc 6 for both\n            ####\n            #if numiter == 730 and numstep == 12: diff\n            #if numiter == 700 and numstep == 12:  # diff (not by much.. in the y)\n            #if numiter == 600 and numstep == 12:  # identical ... 
or so I thought?\n            #if numiter == 650 and numstep == 12:  # diff (1e-6)\n            #if numiter == 625 and numstep == 12:  # diff (1e-5)\n            #if numiter == 612 and numstep == 12:  # diff (1e-6)\n            #if numiter == 606 and numstep == 12:  # diff (1e-7)\n            #if numiter == 603 and numstep == 12:  # diff (1e-6)\n            #if numiter == 601 and numstep == 12:  # diff\n            #if numiter == 600 and numstep == 99: # diff\n            #if numiter == 600 and numstep == 15: #diff\n            #if numiter == 600 and numstep == 1: #diff\n            #if numiter == 500 and numstep == 1: diff\n            #if numiter == 152 and numstep == 1: identical\n            #if numiter == 352 and numstep == 1: # diff\n            #if numiter == 252 and numstep == 1:  # identical!\n            #if numiter == 302 and numstep == 1:  # identical!\n            #if numiter == 332 and numstep == 1:   # diff\n            #if numiter == 316 and numstep == 1:   # diff\n            #if numiter == 309 and numstep == 1:  # diff\n            #if numiter == 304 and numstep == 1:   # identical\n            #if numiter == 306 and numstep == 1:   # identical\n            #if numiter == 308 and numstep == 1:  # diff\n            #if numiter == 307 and numstep == 1:  # diff\n            #if numiter == 306 and numstep == 51:  # diff\n            #if numiter == 306 and numstep == 21:  # diff\n            #if numiter == 306 and numstep == 1:  # identical (confirm)\n            #if numiter == 306 and numstep == 5:  #diff\n            #if numiter == 306 and numstep == 3:  # diff\n            #if numiter == 306 and numstep == 2:  # identical, rposc rposr  3,4, posc posr 5, 3... hebb noticeably diff! 1e-6; alpha/w identical, h2o(hidden) identical\n            #if numiter == 306 and numstep == 1:  h2da(hidden) identical, but h2v(hidden) different! h2o(hidden) identical, hebb different... h2v has identical weights+biases though! 
hidden identical...\n            # wait, hidden NOT identical - pow(2).sum gives exact same result to 36 decimals, but hidden[0,25] does not!\n            #if numiter == 305 and numstep == 1:  # lol, hidden[0,2] is different....\n            #if numiter == 150 and numstep == 99:  # hidden[0,2] different, event though abss().sum(): identical\n            #if numiter == 99 and numstep == 99:   # hidden[0,2] identical...\n            #if numiter == 101 and numstep == 99:   # various components of hidden identical\n            #if numiter == 221 and numstep == 99:   # hidden different; the difference seems to be caused by loss.backward/optimizer.step.. and disappears if lossv is commented out?!  blossv=0 also removes it! vs[15][0] also different (with blossv=0, so no diff in hidden, and no diff in h2v either!) vs[0][0] identical, vs[1][0] different.....(by ~1e-8) again with blossv=0... if I try with normal blossv but preventing loss.backward/optimizer.step, then I get identical vs[2][0]/vs[-1][0]... if I comment out blossv*lossv addition: hidden identical, vs[-1][0] different, h2DA identical, h2v dot hidden.t() identical... v is identical but the vs are different, how can that be?? if I put the set_trace just after vs.append(v), v[-1][0] is identical, but v[2][0] is not.. and neither is vs[-2][0] !!\n            #if numiter == 221 and numstep == 98, interup just after vs.append(v), blossv=0: now vs[-1][0] also different, as is v... hidden is different too! 
Confirmed that if you stop at 99 they are identical (How!!!)\n\n            #if numiter == 121 and numstep == 98: # identical\n            #if numiter == 151 and numstep == 98: # identical\n            #if numiter == 191 and numstep == 98: #identical\n            #if numiter == 208 and numstep == 98:\n            #if numiter == 215 and numstep == 98:\n            #if numiter == 218 and numstep == 98: # all identical\n            #if numiter == 220 and numstep == 98: #h identical, but vs[-]1[0] different!\n            #if numiter == 218 and numstep == 98: #h identical, including vs[-1][0]... but not vs[-21][0] ! vs[2][0] identical though! Lol, all vs identical except vs[-21][0]...\n            #if numiter == 218 and numstep == 77: #h identical,but v different (vs[-1][0] identical. as expect, net.h2v(hidden) different, h2v.weight dot hidden different... but h2v weight/bias have identical abs sum, and so does hidden! torch.matmul(hidden[0,0:14] , net.h2v.weight[0,0:14]) identical, but :15 different!  torch.sum(hidden[0,0:15] - net.h2v.weight[0,0:15]) identical... but if you replace - with * or +, different! hidden[0,14] is different! lol, hidden.sum() and hidden.abs().sum() are identical,  hidden[0,0:].sum()/abs().sum() identical, but hidden[0,0:24].sum() is different! hidden[0,24:].sum() is identical too... 
BASICALLY, the w's and alphas have several differences in the 1e-9 range; the h2v don't\n\n            #if numiter == 120 and numstep == 98: # w's are already different...\n            #if numiter == 101 and numstep == 98: #w's identical\n            #if numiter == 102 and numstep == 98: #w's identical\n            #if numiter == 103 and numstep == 98: # # w's differ in the 1e-10 range \n            #    pdb.set_trace()\n\n            # torch.set_printoptions(precision=30)\n            # np.savetxt('a2.txt', all_losses_objective)\n            # p \"{:.36f}\".format(hidden.abs().sum().data.cpu().numpy()[0])  # Can also give identical results despite some different components??\n            # BAD - may erase too small differences in individual components (bc of squaring) p \"{:.36f}\".format(net.h2DA(hidden).pow(2).sum().data.cpu().numpy()[0])\n            # p \"{:.36f}\".format(hidden[0,2].data.cpu().numpy()[0])\n            # p \"{:.36f}\".format(vs[-1][0].data.cpu().numpy()[0])\n\n            for nb in range(BATCHSIZE):\n                myreward = 0\n                numactionchosen = numactionschosen[nb]\n\n                tgtposc = posc[nb]\n                tgtposr = posr[nb]\n                if numactionchosen == 0:  # Up\n                    tgtposr -= 1\n                elif numactionchosen == 1:  # Down\n                    tgtposr += 1\n                elif numactionchosen == 2:  # Left\n                    tgtposc -= 1\n                elif numactionchosen == 3:  # Right\n                    tgtposc += 1\n                else:\n                    raise ValueError(\"Wrong Action\")\n\n                reward[nb] = 0.0  # The reward for this step\n                if lab[tgtposr][tgtposc] == 1:\n                    reward[nb] -= params['wp']\n                else:\n                    #dist += 1\n                    posc[nb] = tgtposc\n                    posr[nb] = tgtposr\n\n                # Did we hit the reward location ? 
Increase reward and teleport!\n                # Note that it doesn't matter if we teleport onto the reward, since reward hitting is only evaluated after the (obligatory) move\n                if rposr[nb] == posr[nb] and rposc[nb] == posc[nb]:\n                    reward[nb] += params['rew']\n                    posr[nb]= np.random.randint(1, LABSIZE - 1)\n                    posc[nb] = np.random.randint(1, LABSIZE - 1)\n                    while lab[posr[nb], posc[nb]] == 1 or (rposr[nb] == posr[nb] and rposc[nb] == posc[nb]):\n                        posr[nb] = np.random.randint(1, LABSIZE - 1)\n                        posc[nb] = np.random.randint(1, LABSIZE - 1)\n\n            rewards.append(reward)\n            vs.append(v)\n            sumreward += reward\n\n            #loss += ( params['bent'] * y.pow(2).sum() / BATCHSIZE )  # We want to penalize concentration, i.e. encourage diversity; our version of PyTorch does not have an entropy() function for Distribution. Note: .2 may be too strong, .04 may be too weak.\n            loss +=  params['bent'] * y.pow(2).sum()  # We want to penalize concentration, i.e. encourage diversity; our version of PyTorch does not have an entropy() function for Distribution. Note: .2 may be too strong, .04 may be too weak.\n            #lossentmean  = .99 * lossentmean + .01 * ( params['bent'] * y.pow(2).sum() / BATCHSIZE ).data[0] # We want to penalize concentration, i.e. encourage diversity; our version of PyTorch does not have an entropy() function for Distribution. 
Note: .2 may be too strong, .04 may be too weak.\n\n\n            if PRINTTRACE:\n                #print(\"Step \", numstep, \"- GI: \", goodinputs, \", GA: \", goodaction, \" Inputs: \", inputsN, \" - Outputs: \", y.data.cpu().numpy(), \" - action chosen: \", numactionchosen,\n                #        \" - inputsthisstep:\", inputsthisstep, \" - mean abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())), \" -Rew: \", reward)\n                print(\"Step \", numstep, \" Inputs (to 1st in batch): \", inputs[0, :TOTALNBINPUTS], \" - Outputs(1st in batch): \", y[0].data.cpu().numpy(), \" - action chosen(1st in batch): \", numactionschosen[0],\n                        \" - mean abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())), \" -Reward (this step, 1st in batch): \", reward[0])\n\n\n\n        # Episode is done, now let's do the actual computations\n\n\n        #R = Variable(torch.zeros(BATCHSIZE).cuda(), requires_grad=False)\n        R = 0\n        gammaR = params['gr']\n        for numstepb in reversed(range(params['eplen'])) :\n            #R = gammaR * R + Variable(torch.from_numpy(rewards[numstepb]).cuda(), requires_grad=False)\n            #R = gammaR * R + float(rewards[numstepb][0])\n            #ctrR = R - vs[numstepb][0]\n            #lossv += ctrR.pow(2).sum() / BATCHSIZE\n            #loss -= (logprobs[numstepb] * ctrR.detach()).sum() / BATCHSIZE  # Need to check if detach() is OK\n            R = gammaR * R + float(rewards[numstepb][0])\n            lossv += (vs[numstepb][0] - R).pow(2)\n            loss -= logprobs[numstepb] * (R - vs[numstepb].data[0][0])  # Not sure if the \"data\" is needed... 
put it b/c of worry about weird gradient flows\n\n\n        #elif params['algo'] == 'REI':\n        #    R = sumreward\n        #    baseline = meanrewards[rposr, rposc]\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        loss -= logprobs[numstepb] * (R - baseline)\n        #elif params['algo'] == 'REINOB':\n        #    R = sumreward\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        loss -= logprobs[numstepb] * R\n        #elif params['algo'] == 'REITMP':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        loss -= logprobs[numstepb] * R\n        #elif params['algo'] == 'REITMPB':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        loss -= logprobs[numstepb] * (R - meanrewardstmp[rposr, rposc, numstepb])\n\n        #else:\n        #    raise ValueError(\"Which algo?\")\n\n        #meanrewards[rposr, rposc] = (1.0 - params['nu']) * meanrewards[rposr, rposc] + params['nu'] * sumreward\n        #R = 0\n        #for numstepb in reversed(range(params['eplen'])) :\n        #    R = gammaR * R + rewards[numstepb]\n        #    meanrewardstmp[rposr, rposc, numstepb] = (1.0 - params['nu']) * meanrewardstmp[rposr, rposc, numstepb] + params['nu'] * R\n\n        loss += params['blossv'] * lossv\n        loss /= params['eplen']\n\n        if PRINTTRACE:\n            if True: #params['algo'] == 'A3C':\n                print(\"lossv: \", lossv.data.cpu().numpy()[0])\n            print (\"Total reward for this episode (all):\", sumreward, \"Dist:\", dist)\n\n        #if params['squash'] == 1:\n        #    if sumreward < 0:\n        #        sumreward = -np.sqrt(-sumreward)\n        #    else:\n        #        sumreward = np.sqrt(sumreward)\n        #elif params['squash'] == 0:\n        #    pass\n        
#else:\n        #    raise ValueError(\"Incorrect value for squash parameter\")\n\n        #loss *= sumreward\n            \n            \n        #if numiter == 102 :   # w identical, final hidden identical... but loss slightly different! as is lossv (shouldn't matter since its addition to loss is commented out). vs apparently identical\n        #if numiter == 182 :   # identical loss (after fixing the rewards computations)\n        #if numiter == 202 :   # identical loss (but loss_between_saves different, which means some losses different?...)\n        #if numiter == 222 :    # loss is different\n        #if numiter == 33 :\n        # Debug breakpoint; keep commented out for actual training runs:\n        #if numiter == 101 :\n        #    pdb.set_trace()\n\n        #for p in net.parameters():\n        #    p.grad.data.clamp_(-params['clp'], params['clp'])\n        if numiter > 100:  # Burn-in period for meanrewards\n            loss.backward()\n            optimizer.step()\n\n        #torch.cuda.empty_cache()\n\n        #print(sumreward)\n        lossnum = loss.data[0]\n        lossbetweensaves += lossnum\n\n        all_losses_objective.append(lossnum)\n        all_total_rewards.append(sumreward.mean())\n        #all_losses_v.append(lossv.data[0])\n        #total_loss  += lossnum\n\n\n        if (numiter+1) % params['pe'] == 0:\n\n            np.savetxt('a1.txt', all_losses_objective)\n\n            print(numiter, \"====\")\n            print(\"Mean loss: \", lossbetweensaves / params['pe'])\n            lossbetweensaves = 0\n            print(\"Mean reward (across batch and last\", params['pe'], \"eps.): \", np.sum(all_total_rewards[-params['pe']:])/ params['pe'])\n            #print(\"Mean reward (across batch): \", sumreward.mean())\n            previoustime = nowtime\n            nowtime = time.time()\n            print(\"Time spent on last\", params['pe'], \"iters: \", nowtime - previoustime)\n            if params['type'] == 'plastic' or params['type'] == 'lstmplastic':\n                print(\"ETA: \", 
net.eta.data.cpu().numpy(), \"alpha[0,1]: \", net.alpha.data.cpu().numpy()[0,1], \"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n            elif params['type'] == 'modul':\n                print(\"ETA: \", net.eta.data.cpu().numpy(), \" etaet: \", net.etaet.data.cpu().numpy(), \" mean-abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())))\n            elif params['type'] == 'rnn':\n                print(\"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n\n        if (numiter+1) % params['save_every'] == 0:\n            print(\"Saving files...\")\n#            lossbetweensaves /= params['save_every']\n#            print(\"Average loss over the last\", params['save_every'], \"episodes:\", lossbetweensaves)\n#            print(\"Alternative computation (should be equal):\", np.mean(all_losses_objective[-params['save_every']:]))\n            losslast100 = np.mean(all_losses_objective[-100:])\n            print(\"Average loss over the last 100 episodes:\", losslast100)\n#            # Instability detection; necessary for SELUs, which seem to be divergence-prone\n#            # Note that if we are unlucky enough to have diverged within the last 100 timesteps, this may not save us.\n#            if losslast100 > 2 * lossbetweensavesprev:\n#                print(\"We have diverged ! 
Restoring last savepoint!\")\n#                net.load_state_dict(torch.load('./torchmodel_'+suffix + '.txt'))\n#            else:\n            print(\"Saving local files...\")\n            #with open('params_'+suffix+'.dat', 'wb') as fo:\n            #        #pickle.dump(net.w.data.cpu().numpy(), fo)\n            #        #pickle.dump(net.alpha.data.cpu().numpy(), fo)\n            #        #pickle.dump(net.eta.data.cpu().numpy(), fo)\n            #        #pickle.dump(all_losses, fo)\n            #        pickle.dump(params, fo)\n            #with open('loss_'+suffix+'.txt', 'w') as thefile:\n            #    for item in all_losses_objective:\n            #            thefile.write(\"%s\\n\" % item)\n            #with open('lossv_'+suffix+'.txt', 'w') as thefile:\n            #    for item in all_losses_v:\n            #            thefile.write(\"%s\\n\" % item)\n            with open('loss_'+suffix+'.txt', 'w') as thefile:\n                for item in all_total_rewards[::10]:\n                        thefile.write(\"%s\\n\" % item)\n            torch.save(net.state_dict(), 'torchmodel_'+suffix+'.dat')\n            with open('params_'+suffix+'.dat', 'wb') as fo:\n                pickle.dump(params, fo)\n            if os.path.isdir('/mnt/share/tmiconi'):\n                print(\"Transferring to NFS storage...\")\n                for fn in ['params_'+suffix+'.dat', 'loss_'+suffix+'.txt', 'torchmodel_'+suffix+'.dat']:\n                    result = os.system(\n                        'cp {} {}'.format(fn, '/mnt/share/tmiconi/modulmaze/'+fn))\n                print(\"Done!\")\n#            lossbetweensavesprev = lossbetweensaves\n#            lossbetweensaves = 0\n#            sys.stdout.flush()\n#            sys.stderr.flush()\n\n\n\nif __name__ == \"__main__\":\n#defaultParams = {\n#    'type' : 'lstm',\n#    'seqlen' : 200,\n#    'hs': 500,\n#    'activ': 'tanh',\n#    'steplr': 10e9,  # By default, no change in the learning rate\n#    'gamma': .5,  # The 
annealing factor of learning rate decay for Adam\n#    'imagesize': 31,\n#    'nbiter': 30000,\n#    'lr': 1e-4,\n#    'test_every': 10,\n#    'save_every': 3000,\n#    'rngseed':0\n#}\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--rngseed\", type=int, help=\"random seed\", default=0)\n    #parser.add_argument(\"--clamp\", type=float, help=\"maximum (absolute value) gradient for clamping\", default=1000000.0)\n    #parser.add_argument(\"--wp\", type=float, help=\"wall penalty (reward decrement for hitting a wall)\", default=0.1)\n    parser.add_argument(\"--rew\", type=float, help=\"reward value (reward increment for taking correct action after correct stimulus)\", default=1.0)\n    parser.add_argument(\"--wp\", type=float, help=\"penalty for hitting walls\", default=.05)\n    #parser.add_argument(\"--pen\", type=float, help=\"penalty value (reward decrement for taking any non-rest action)\", default=.2)\n    #parser.add_argument(\"--exprew\", type=float, help=\"reward value (reward increment for hitting reward location)\", default=.0)\n    parser.add_argument(\"--bent\", type=float, help=\"coefficient for the entropy reward (really Simpson index concentration measure)\", default=0.03)\n    #parser.add_argument(\"--probarev\", type=float, help=\"probability of reversal (random change) in desired stimulus-response, per time step\", default=0.0)\n    parser.add_argument(\"--blossv\", type=float, help=\"coefficient for value prediction loss\", default=.1)\n    #parser.add_argument(\"--lsize\", type=int, help=\"size of the labyrinth; must be odd\", default=7)\n    #parser.add_argument(\"--rp\", type=int, help=\"whether the reward should be on the periphery\", default=0)\n    #parser.add_argument(\"--squash\", type=int, help=\"squash reward through signed sqrt (1 or 0)\", default=0)\n    #parser.add_argument(\"--nbarms\", type=int, help=\"number of arms\", default=2)\n    #parser.add_argument(\"--nbseq\", type=int, help=\"number of sequences 
between reinitializations of hidden/Hebbian state and position\", default=3)\n    #parser.add_argument(\"--activ\", help=\"activ function ('tanh' or 'selu')\", default='tanh')\n    #parser.add_argument(\"--algo\", help=\"meta-learning algorithm (A3C or REI)\", default='A3C')\n    parser.add_argument(\"--rule\", help=\"learning rule ('hebb' or 'oja')\", default='hebb')\n    parser.add_argument(\"--type\", help=\"network type ('lstm' or 'rnn' or 'plastic')\", default='modul')\n    parser.add_argument(\"--msize\", type=int, help=\"size of the maze; must be odd\", default=9)\n    parser.add_argument(\"--da\", help=\"transformation function of DA signal (tanh or sig or lin)\", default='tanh')\n    parser.add_argument(\"--gr\", type=float, help=\"gammaR: discounting factor for rewards\", default=.9)\n    parser.add_argument(\"--lr\", type=float, help=\"learning rate (Adam optimizer)\", default=1e-4)\n    parser.add_argument(\"--fm\", type=int, help=\"if using neuromodulation, do we modulate the whole network (1) or just half (0) ?\", default=1)\n    #parser.add_argument(\"--nu\", type=float, help=\"REINFORCE baseline time constant\", default=.1)\n    #parser.add_argument(\"--samestep\", type=int, help=\"compare stimulus and response in the same step (1) or from successive steps (0) ?\", default=0)\n    #parser.add_argument(\"--nbin\", type=int, help=\"number of possible input stimuli\", default=4)\n    #parser.add_argument(\"--modhalf\", type=int, help=\"which half of the recurrent network receives modulation (1 or 2)\", default=1)\n    #parser.add_argument(\"--nbac\", type=int, help=\"number of possible non-rest actions\", default=4)\n    parser.add_argument(\"--rsp\", type=int, help=\"does the agent start each episode from a random position (1) or the center (0) ?\", default=1)\n    parser.add_argument(\"--addpw\", type=int, help=\"are plastic weights purely additive (1) or forgetting (0) ?\", default=1)\n    #parser.add_argument(\"--clp\", type=int, help=\"inputs clamped 
(1), fully clamped (2) or through linear layer (0) ?\", default=0)\n    #parser.add_argument(\"--md\", type=int, help=\"maximum delay for reward reception\", default=0)\n    parser.add_argument(\"--eplen\", type=int, help=\"length of episodes\", default=100)\n    #parser.add_argument(\"--exptime\", type=int, help=\"exploration (no reward) time (must be < eplen)\", default=0)\n    parser.add_argument(\"--hs\", type=int, help=\"size of the recurrent (hidden) layer\", default=100)\n    parser.add_argument(\"--bs\", type=int, help=\"batch size\", default=1)\n    parser.add_argument(\"--l2\", type=float, help=\"coefficient of L2 norm (weight decay)\", default=3e-6)\n    #parser.add_argument(\"--steplr\", type=int, help=\"duration of each step in the learning rate annealing schedule\", default=100000000)\n    #parser.add_argument(\"--gamma\", type=float, help=\"learning rate annealing factor\", default=0.3)\n    parser.add_argument(\"--nbiter\", type=int, help=\"number of learning cycles\", default=1000000)\n    parser.add_argument(\"--save_every\", type=int, help=\"number of cycles between successive save points\", default=1000)\n    parser.add_argument(\"--pe\", type=int, help=\"number of cycles between successive printing of information\", default=100)\n    #parser.add_argument(\"--\", type=int, help=\"\", default=1e-4)\n    args = parser.parse_args(); argvars = vars(args); argdict =  { k : argvars[k] for k in argvars if argvars[k] != None }\n    #train()\n    train(argdict)\n\n"
  },
  {
    "path": "maze/testnobatch.py",
    "content": "import argparse\nimport pdb\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nfrom torch import optim\nfrom torch.optim import lr_scheduler\nimport random\nimport sys\nimport pickle\nimport time\nimport os\nimport platform\n#import makemaze\n\n#import matplotlib.pyplot as plt\nimport glob\n\n\n\n\n\nnp.set_printoptions(precision=4)\n\nNBDA = 1\n\n\nADDINPUT = 4 # 1 input for the previous reward, 1 input for numstep, 1 unused, 1 \"Bias\" input\n\nNBACTIONS = 4  # U, D, L, R\n\nRFSIZE = 3 # Receptive Field\n\nTOTALNBINPUTS =  RFSIZE * RFSIZE + ADDINPUT + NBACTIONS\n\n##ttype = torch.FloatTensor;\n#ttype = torch.cuda.FloatTensor;\n\n\nclass Network(nn.Module):\n    def __init__(self, params):\n        super(Network, self).__init__()\n        self.rule = params['rule']\n        self.type = params['type']\n        self.softmax = torch.nn.functional.softmax\n        #if params['activ'] == 'tanh':\n        self.activ = F.tanh\n        #elif params['activ'] == 'selu':\n        #    self.activ = F.selu\n        #else:\n        #    raise ValueError('Must choose an activ function')\n        if params['type'] == 'lstm':\n            self.lstm = torch.nn.LSTM(TOTALNBINPUTS, params['hs']).cuda()\n        elif params['type'] == 'rnn':\n            self.i2h = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            #self.inputnegmask = Variable(torch.ones(1, params['hs']), requires_grad=False).cuda()\n            #self.inputnegmask[0, :TOTALNBINPUTS] = 0   # no modulation for 2nd half\n        elif params['type'] == 'modplast' or params['type'] == 'modplast2':\n            self.i2h = torch.nn.Linear(TOTALNBINPUTS, 
params['hs']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            self.alpha =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            self.eta = torch.nn.Parameter((.01 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same eta\n            #self.inputnegmask = Variable(torch.ones(1, params['hs']), requires_grad=False).cuda()\n            #self.inputnegmask[0, :TOTALNBINPUTS] = 0   # no modulation for 2nd half\n            self.h2DA = torch.nn.Linear(params['hs'], NBDA).cuda()\n            self.DAoutV = Variable(torch.ones(1, params['hs']), requires_grad=False).cuda()\n        elif params['type'] == 'plastic' :\n            self.i2h = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            self.alpha =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            self.eta = torch.nn.Parameter((.01 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same eta\n            #self.inputnegmask = Variable(torch.ones(1, params['hs']), requires_grad=False).cuda()\n            #self.inputnegmask[0, :TOTALNBINPUTS] = 0   # no modulation for 2nd half\n        elif params['type'] == 'modul' or params['type'] == 'modul2':\n            self.i2h = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            self.alpha =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            # Note that initial eta is higher (faster) than before\n            self.eta = torch.nn.Parameter((.1 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same eta\n            self.etaet = 
torch.nn.Parameter((.1 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same etaet\n            self.etapw = torch.nn.Parameter((.1 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same etapw\n            self.h2DA = torch.nn.Linear(params['hs'], NBDA).cuda()\n            # The daweights vectors are weight vectors from the DA output neurons to the network hidden (recurrent) neurons\n            #self.daweights0 = Variable(torch.ones(1, params['hs']), requires_grad=False).cuda()\n            #self.daweights0[0, (params['hs'] // 2):] = 0   # no modulation for 2nd half\n            #self.inputnegmask = Variable(torch.ones(1, params['hs']), requires_grad=False).cuda()\n            #self.inputnegmask[0, :TOTALNBINPUTS] = 0   # no modulation for 2nd half\n\n            #else:\n            #    raise ValueError(\"Must specify which half of the network receives modulation\")\n            self.daweights1 = Variable(torch.ones(1, params['hs']), requires_grad=False).cuda()\n            self.daweights1[0, :(params['hs'] // 4)] = 0\n            self.daweights1[0, -(params['hs'] // 4):] = 0\n        elif params['type'] == 'lstmplastic':\n            self.h2f = torch.nn.Linear(params['hs'], params['hs']).cuda()\n            self.h2i = torch.nn.Linear(params['hs'], params['hs']).cuda()\n            self.h2opt = torch.nn.Linear(params['hs'], params['hs']).cuda()\n\n            # Plasticity in the recurrent connections, h to c:\n            #self.h2c = torch.nn.Linear(params['hs'], params['hs']).cuda()\n            self.w =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            self.alpha =  torch.nn.Parameter((.01 * torch.rand(params['hs'], params['hs'])).cuda(), requires_grad=True)\n            self.eta = torch.nn.Parameter((.01 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same eta\n\n            self.x2f = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            
self.x2opt = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.x2i = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.x2c = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n        elif params['type'] == 'lstmmanual':\n            self.h2f = torch.nn.Linear(params['hs'], params['hs']).cuda()\n            self.h2i = torch.nn.Linear(params['hs'], params['hs']).cuda()\n            self.h2opt = torch.nn.Linear(params['hs'], params['hs']).cuda()\n            self.h2c = torch.nn.Linear(params['hs'], params['hs']).cuda()\n            self.x2f = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.x2opt = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.x2i = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            self.x2c = torch.nn.Linear(TOTALNBINPUTS, params['hs']).cuda()\n            ##fgt = F.sigmoid(self.x2f(inputs) + self.h2f(hidden[0]))\n            ##ipt = F.sigmoid(self.x2i(inputs) + self.h2i(hidden[0]))\n            ##opt = F.sigmoid(self.x2o(inputs) + self.h2o(hidden[0]))\n            ##cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, F.tanh(self.x2c(inputs) + self.h2c(hidden[0])))\n            ##h = torch.mul(opt, cell)\n            ##hidden = (h, cell)\n        else:\n            raise ValueError(\"Which network type?\")\n        self.h2o = torch.nn.Linear(params['hs'], NBACTIONS).cuda()\n        self.h2v = torch.nn.Linear(params['hs'], 1).cuda()\n        self.params = params\n\n        # Notice that the vectors are row vectors, and the matrices are transposed wrt the usual order, following apparent pytorch conventions\n        # Each *column* of w targets a single output neuron\n\n    def forward(self, inputs, hidden, hebb, et, pw):\n        if self.type == 'lstm':\n            hactiv, hidden = self.lstm(inputs.view(1, 1, -1), hidden)  # hactiv is just the h. 
hidden is the h and the cell state, in a tuple\n            hactiv = hactiv[0]\n            activout = self.softmax(self.h2o(hactiv))\n            valueout = self.h2v(hactiv)\n            #pdb.set_trace()\n            #hactiv = hactiv.view(1, -1)  # Apparently this was causing memory leaks?.....\n\n        # Draft for a \"manual\" lstm:\n        elif self.type== 'lstmmanual':\n            # hidden[0] is the previous h state. hidden[1] is the previous c state\n            fgt = F.sigmoid(self.x2f(inputs) + self.h2f(hidden[0]))\n            ipt = F.sigmoid(self.x2i(inputs) + self.h2i(hidden[0]))\n            opt = F.sigmoid(self.x2opt(inputs) + self.h2opt(hidden[0]))\n            cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, F.tanh(self.x2c(inputs) + self.h2c(hidden[0])))\n            hactiv = torch.mul(opt, F.tanh(cell))\n            #pdb.set_trace()\n            hidden = (hactiv, cell)\n            activout = self.softmax(self.h2o(hactiv))\n            valueout = self.h2v(hactiv)\n            if np.isnan(np.sum(hactiv.data.cpu().numpy())) or np.isnan(np.sum(hidden[1].data.cpu().numpy())) :\n                raise ValueError(\"Nan detected !\")\n\n        elif self.type== 'lstmplastic':\n            fgt = F.sigmoid(self.x2f(inputs) + self.h2f(hidden[0]))\n            ipt = F.sigmoid(self.x2i(inputs) + self.h2i(hidden[0]))\n            opt = F.sigmoid(self.x2opt(inputs) + self.h2opt(hidden[0]))\n            #cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, F.tanh(self.x2c(inputs) + self.h2c(hidden[0])))\n\n            #Need to think what the inputs and outputs should be for the\n            #plasticity. It might be worth introducing an additional stage\n            #consisting of whatever is multiplied by ift and then added to the\n            #cell state, rather than the full cell state.... 
But we can\n            #experiment both!\n            #cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, F.tanh(self.x2c(inputs) + hidden[0].mm(self.w + torch.mul(self.alpha, hebb)))) #  self.h2c(hidden[0])))\n            inputstocell =  F.tanh(self.x2c(inputs) + hidden[0].mm(self.w + torch.mul(self.alpha, hebb)))\n            cell = torch.mul(fgt, hidden[1]) + torch.mul(ipt, inputstocell) #  self.h2c(hidden[0])))\n\n            if self.rule == 'hebb':\n                raise ValueError(\"Not yet implemented!\")\n                #hebb = (1 - self.eta) * hebb + self.eta * torch.bmm(hidden[0].unsqueeze(2), cell.unsqueeze(1))[0]\n                hebb = (1 - self.eta) * hebb + self.eta * torch.bmm(hidden[0].unsqueeze(2), inputstocell.unsqueeze(1))[0]\n            elif self.rule == 'oja':\n                raise ValueError(\"Not yet implemented!\")\n                # NOTE: NOT SURE ABOUT THE OJA VERSION !!\n                hebb = hebb + self.eta * torch.mul((hidden[0][0].unsqueeze(1) - torch.mul(hebb , inputstocell[0].unsqueeze(0))) , inputstocell[0].unsqueeze(0))  # Oja's rule. Remember that yin, yout are row vectors (dim (1,N)). Also, broadcasting!\n                #hebb = hebb + self.eta * torch.mul((hidden[0].unsqueeze(1) - torch.mul(hebb , hactiv[0].unsqueeze(0))) , hactiv[0].unsqueeze(0))  # Oja's rule. Remember that yin, yout are row vectors (dim (1,N)). 
Also, broadcasting!\n            hactiv = torch.mul(opt, F.tanh(cell))\n            #pdb.set_trace()\n            hidden = (hactiv, cell)\n            if np.isnan(np.sum(hactiv.data.cpu().numpy())) or np.isnan(np.sum(hidden[1].data.cpu().numpy())) :\n                raise ValueError(\"Nan detected !\")\n            activout = self.softmax(self.h2o(hactiv))\n            valueout = self.h2v(hactiv)\n\n\n\n\n\n        elif self.type == 'rnn':\n            if self.params['clp'] == 0:\n                hactiv = self.activ(self.i2h(inputs) + hidden.mm(self.w))\n            elif self.params['clp'] == 1:\n                hactiv = self.activ(inputs + hidden.mm(self.w))\n            #elif self.params['clp'] == 2:\n            #    hidden = self.inputnegmask.mul(hidden) + inputs\n            #    hactiv = self.activ(hidden.mm(self.w))\n            #    hactiv = self.inputnegmask.mul(hactiv) + inputs\n            hidden = hactiv\n            #activout = self.softmax(self.h2o(hactiv))\n            activout = self.h2o(hactiv)   # Linear!\n            valueout = self.h2v(hactiv)\n            #valueout = 0\n\n        elif self.type == 'plastic_prev':\n            # The columns of w and pw are the inputs weights to a single neuron\n            if self.params['clp'] == 1:\n                hactiv = self.activ(inputs + hidden.mm(self.w + torch.mul(self.alpha, hebb)))\n            elif self.params['clp'] == 0:  # No clamping, input layer\n                hactiv = self.activ(self.i2h(inputs) + hidden.mm(self.w + torch.mul(self.alpha, hebb)))\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed later\n            valueout = self.h2v(hactiv)\n\n            if self.rule == 'hebb':\n                hebb = (1 - self.eta) * hebb + self.eta * torch.bmm(hidden.unsqueeze(2), hactiv.unsqueeze(1))[0]\n            elif self.rule == 'oja':\n                hebb = hebb + self.eta * torch.mul((hidden[0].unsqueeze(1) - torch.mul(hebb , hactiv[0].unsqueeze(0))) , 
hactiv[0].unsqueeze(0))  # Oja's rule. Remember that yin, yout are row vectors (dim (1,N)). Also, broadcasting!\n            else:\n                raise ValueError(\"Must specify learning rule ('hebb' or 'oja')\")\n            hidden = hactiv\n\n        elif self.type == 'plastic':\n            # The columns of w and pw are the inputs weights to a single neuron (?)\n            if self.params['clp'] == 1:\n                hactiv = self.activ(inputs + hidden.mm(self.w + torch.mul(self.alpha, hebb)))\n            elif self.params['clp'] == 0:  # No clamping, input layer\n                hactiv = self.activ(self.i2h(inputs) + hidden.mm(self.w + torch.mul(self.alpha, hebb)))\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed later\n            valueout = self.h2v(hactiv)\n\n            if self.rule == 'hebb':\n                deltahebb = torch.bmm(hidden.unsqueeze(2), hactiv.unsqueeze(1))[0]\n            elif self.rule == 'oja':\n                deltahebb = torch.mul((hidden[0].unsqueeze(1) - torch.mul(hebb , hactiv[0].unsqueeze(0))) , hactiv[0].unsqueeze(0))\n\n            if self.params['addpw'] == 3:\n                # Note that there is no decay, even in the Hebb-rule case : additive only!\n                # Hard clamp\n                hebb = torch.clamp( hebb +  self.eta * deltahebb, min=-1.0, max=1.0)\n            elif self.params['addpw'] == 2:\n                # Note that there is no decay, even in the Hebb-rule case : additive only!\n                # Soft clamp\n                hebb = torch.clamp( hebb +  torch.clamp(self.eta * deltahebb, min=0.0) * (1 - hebb) +  torch.clamp(self.eta * deltahebb, max=0.0) * (hebb + 1) , min=-1.0, max=1.0)\n            elif self.params['addpw'] == 1: # Purely additive, tends to make the meta-learning diverge. No decay/clamp.\n                hebb = hebb + self.eta * deltahebb\n            elif self.params['addpw'] == 0:\n                # We do it the normal way. 
Note that here, Hebb-rule is decaying.\n                # There is probably a way to make it more efficient.\n                # Note 2: For Oja's rule, there is no difference between addpw 0 and addpw1\n                if self.rule == 'hebb':\n                    hebb = (1 - self.eta) * hebb + self.eta * deltahebb\n                elif self.rule == 'oja':\n                    hebb =  hebb + self.eta  * deltahebb\n\n            hidden = hactiv\n\n        elif self.type == 'modplast2':\n\n            #Here we compute the same deltahebb for the whole network, and use\n            #the same addpw for the whole network too.  #Only difference between\n            #modulated and non-modulated halves is whether eta is the network's\n            #(learned) eta parameter or the neuromodulator output DAout\n\n            # The columns of w and pw are the inputs weights to a single neuron\n            if self.params['clp'] == 1:\n                hactiv = self.activ(inputs + hidden.mm(self.w + torch.mul(self.alpha, hebb)))\n            else:  # No clamping, input layer\n                hactiv = self.activ(self.i2h(inputs) + hidden.mm(self.w + torch.mul(self.alpha, hebb)))\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed later\n            valueout = self.h2v(hactiv)\n\n            # Now computing the Hebbian updates...\n\n            if self.params['da'] == 'tanh':\n                DAout = F.tanh(self.h2DA(hactiv))\n            elif self.params['da'] == 'sig':\n                DAout = F.sigmoid(self.h2DA(hactiv))\n            elif self.params['da'] == 'lin':\n                DAout =  self.h2DA(hactiv)\n            else:\n                raise ValueError(\"Which transformation for DAout ?\")\n\n            if self.rule == 'hebb':\n                deltahebb = torch.bmm(hidden.unsqueeze(2), hactiv.unsqueeze(1))[0]\n            elif self.rule == 'oja':\n                deltahebb = torch.mul((hidden[0].unsqueeze(1) - torch.mul(hebb , 
hactiv[0].unsqueeze(0))) , hactiv[0].unsqueeze(0))\n\n            if self.params['addpw'] == 3: # Hard clamp, purely additive\n                # Note that we do the same for Hebb and Oja's rule\n                hebb1 = torch.clamp(hebb + DAout[0,0] * deltahebb, min=-1.0, max=1.0)\n                hebb2 = torch.clamp(hebb + self.eta * deltahebb, min=-1.0, max=1.0)\n            elif self.params['addpw'] == 2:\n                # Note that there is no decay, even in the Hebb-rule case : additive only!\n                hebb1 = torch.clamp( hebb +  torch.clamp(DAout[0,0] * deltahebb, min=0.0) * (1 - hebb) +  torch.clamp(DAout[0,0] * deltahebb, max=0.0) * (hebb + 1) , min=-1.0, max=1.0)\n                hebb2 = torch.clamp( hebb +  torch.clamp(self.eta * deltahebb, min=0.0) * (1 - hebb) +  torch.clamp(self.eta * deltahebb, max=0.0) * (hebb + 1) , min=-1.0, max=1.0)\n            elif self.params['addpw'] == 1: # Purely additive. This will almost certainly diverge, don't use it!\n                hebb1 = hebb + DAout[0,0] * deltahebb\n                hebb2 = hebb + self.eta * deltahebb\n\n            elif self.params['addpw'] == 0:\n                # We do it the normal way. 
Note that here, Hebb-rule is decaying.\n                # There is probably a way to make it more efficient\n                # Note: This can go awry if DAout can go negative or outside [0,1]!\n                # Note 2: For Oja's rule, there is no difference between addpw 0 and addpw1\n                if self.rule == 'hebb':\n                    hebb1 = (1 - DAout[0,0]) * hebb + DAout[0,0] * deltahebb\n                    hebb2 = (1 - self.eta) * hebb + DAout[0,0] * deltahebb\n                elif self.rule == 'oja':\n                    hebb1=  hebb + DAout[0,0] * deltahebb\n                    hebb2=  hebb + self.eta * deltahebb\n            else:\n                raise ValueError(\"Which additive form for plastic weights?\")\n\n            if self.params['fm'] == 1:\n                hebb = hebb1\n            elif self.params['fm'] == 0:\n                hebb = torch.cat( (hebb1[:, :self.params['hs']//2], hebb2[:, self.params['hs']//2:]), dim=1)\n            else:\n                raise ValueError(\"Must select whether fully modulated or not\")\n\n            hidden = hactiv\n\n\n        elif self.type == 'modplast':\n            # The actual network update should be the same as for \"plastic\". 
Only the Hebbian updates should be different\n            # The columns of w and pw are the inputs weights to a single neuron\n            if self.params['clp'] == 1:\n                hactiv = self.activ(inputs + hidden.mm(self.w + torch.mul(self.alpha, hebb)))\n            else:  # No clamping, input layer\n                hactiv = self.activ(self.i2h(inputs) + hidden.mm(self.w + torch.mul(self.alpha, hebb)))\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed later\n            valueout = self.h2v(hactiv)\n\n            # Now computing the Hebbian updates...\n\n            if self.params['da'] == 'tanh':\n                DAout = F.tanh(self.h2DA(hactiv))\n            elif self.params['da'] == 'sig':\n                DAout = F.sigmoid(self.h2DA(hactiv))\n            elif self.params['da'] == 'lin':\n                DAout =  self.h2DA(hactiv)\n            else:\n                raise ValueError(\"Which transformation for DAout ?\")\n\n            if self.rule == 'hebb':\n                deltahebb = torch.bmm(hidden.unsqueeze(2), hactiv.unsqueeze(1))[0]\n            elif self.rule == 'oja':\n                deltahebb = torch.mul((hidden[0].unsqueeze(1) - torch.mul(hebb , hactiv[0].unsqueeze(0))) , hactiv[0].unsqueeze(0))\n\n            if self.params['addpw'] == 3: # Hard clamp, purely additive\n                # Note that we do the same for Hebb and Oja's rule\n                hebb1 = torch.clamp(hebb + DAout[0,0] * deltahebb, min=-1.0, max=1.0)\n            elif self.params['addpw'] == 2:\n                # Note that there is no decay, even in the Hebb-rule case : additive only!\n                hebb1 = torch.clamp( hebb +  torch.clamp(DAout[0,0] * deltahebb, min=0.0) * (1 - hebb) +  torch.clamp(DAout[0,0] * deltahebb, max=0.0) * (hebb + 1) , min=-1.0, max=1.0)\n            elif self.params['addpw'] == 1: # Purely additive, tends to make the meta-learning diverge\n                # Note that we do the same for Hebb and Oja's 
rule\n                hebb1 = hebb + DAout[0,0] * deltahebb\n\n            elif self.params['addpw'] == 0:\n                # We do it the normal way. Note that here, Hebb-rule is decaying.\n                # There is probably a way to make it more efficient by grouping it with the computation of the other (non-modulated) half.\n                # NOTE: This can go awry if DAout can go negative!\n                # Note 2: For Oja's rule, there is no difference between addpw 0 and addpw1\n                if self.rule == 'hebb':\n                    hebb1 = (1 - DAout[0,0]) * hebb + DAout[0,0] * deltahebb\n                elif self.rule == 'oja':\n                    hebb1=  hebb + DAout[0,0] * deltahebb\n            else:\n                raise ValueError(\"Which additive form for plastic weights?\")\n\n            # The non-neuromodulated half of the network just does standard plasticity, using learned self.eta.\n            if self.rule == 'hebb':\n                hebb2 = (1 - self.eta) * hebb + self.eta * deltahebb\n            elif self.rule == 'oja':\n                hebb2 = hebb + self.eta * deltahebb\n            else:\n                raise ValueError(\"Must specify learning rule ('hebb' or 'oja')\")\n\n            if self.params['fm'] == 1:\n                hebb = hebb1\n            elif self.params['fm'] == 0:\n                hebb = torch.cat( (hebb1[:, :self.params['hs']//2], hebb2[:, self.params['hs']//2:]), dim=1)\n            else:\n                raise ValueError(\"Must select whether fully modulated or not\")\n\n            hidden = hactiv\n\n\n        elif self.type == 'modul':\n\n            # One half of the network receives neuromodulation. 
The other just\n            # does plain Hebbian plasticity; note that the eta's for the\n            # Hebbian trace and the eligibility trace are different\n\n            # We need to select the order of operations; network update, hebb update, neuromodulated incorporation into stable plastic weights\n            # One possibility (for now go with this one):\n            #    - computing all outputs from current inputs, including DA\n            #    - incorporating neuromodulated Hebb/eligibility trace into plastic weights\n            #    - computing updated hebb\n            # Another possibility:\n            #    - computing all outputs from current inputs, including DA\n            #    - computing updated Hebb\n            #    - incorporating this modified Hebb into plastic weights through neuromodulation\n\n            # The columns of w and pw are the inputs weights to a single neuron\n            if self.params['clp'] == 0:\n                hactiv = self.activ(self.i2h(inputs) + hidden.mm(self.w + torch.mul(self.alpha, pw)))\n            elif self.params['clp'] == 1:\n                hactiv = self.activ(inputs + hidden.mm(self.w + torch.mul(self.alpha, pw)))\n            #else:\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed later\n            valueout = self.h2v(hactiv)\n            #valueout = 0\n            if self.params['da'] == 'tanh':\n                DAout = F.tanh(self.h2DA(hactiv))\n            elif self.params['da'] == 'sig':\n                DAout = F.sigmoid(self.h2DA(hactiv))\n            elif self.params['da'] == 'lin':\n                DAout =  self.h2DA(hactiv)\n            else:\n                raise ValueError(\"Which transformation for DAout ?\")\n\n            if self.params['addpw'] == 3:\n                # Hard clamp\n                deltapw = DAout[0,0] * et\n                pw1 = torch.clamp(pw + deltapw, min=-1.0, max=1.0)\n            #if self.params['addpw'] == 3:\n            #    # 
Constrained AND cubed: This makes the soft bounds \"softer\", so the values can come closer to -1 and 1.\n            #    # Absolutely no difference in performance from addpw=2 !\n            #    deltapw = DAout[0,0] * et\n            #    pw1 = torch.clamp( pw +  torch.clamp(deltapw, min=0.0) * (1 - pw ** 3) +  torch.clamp(deltapw, max=0.0) * (pw ** 3 + 1) , min=-1.0, max=1.0)\n            #    #if np.random.rand() < .05:\n            #    #    pdb.set_trace()\n            elif self.params['addpw'] == 2:\n                deltapw = DAout[0,0] * et\n                # This constrains the pw to stay within [-1, 1] (we could do that by putting a tanh on top of it, but we want pw itself to remain within that range to avoid large gradients)\n                # The outer clamp is there for safety. In theory the expression within that clamp is \"softly\" constrained to stay within [-1, 1], but finite-size effects might throw it off.\n                # Note that cubing pw in the boundary terms below would make the bounds \"softer\" and allow a wider range, but in practice it makes no difference in performance.\n                pw1 = torch.clamp( pw +  torch.clamp(deltapw, min=0.0) * (1 - pw) +  torch.clamp(deltapw, max=0.0) * (pw + 1) , min=-.99999, max=.99999)\n                #if np.random.rand() < .05:\n                #    pdb.set_trace()\n            if self.params['addpw'] == 1: # Purely additive, tends to make the meta-learning diverge\n                deltapw = DAout[0,0] * et\n                pw1 = pw + deltapw\n            elif self.params['addpw'] == 0:\n                # Problem: this makes the plastic weights decaying!\n                pw1 = pw - torch.abs(self.etapw) * pw + self.etapw * DAout[0,0] * et\n\n            # Should we have a fully neuromodulated network, or only half?\n            if self.params['fm'] == 1:\n                pw = pw1\n            elif self.params['fm'] == 0:\n                pw = torch.cat( (hebb[:, :self.params['hs']//2], pw1[:, 
self.params['hs'] // 2:]), dim=1) # Use output argument?\n            else:\n                raise ValueError(\"Must select whether fully modulated or not\")\n\n            # Note that the 'hebb' variable is only for the non-modulated part,\n            # which is only used if params['fm'] == 0; also, hebb can be\n            # updated Oja or decaying, but et is always decaying.\n            if self.rule == 'hebb':\n                deltahebb = torch.bmm(hidden.unsqueeze(2), hactiv.unsqueeze(1))[0]\n                hebb = (1 - self.eta) * hebb + self.eta * deltahebb\n                et = (1 - self.etaet) * et + self.etaet *  deltahebb\n            elif self.rule == 'oja':\n                #raise ValueError(\"Not yet implemented!\")\n                hebb = hebb + self.eta * torch.mul((hidden[0].unsqueeze(1) - torch.mul(hebb , hactiv[0].unsqueeze(0))) , hactiv[0].unsqueeze(0))  # Oja's rule. Remember that yin, yout are row vectors (dim (1,N)). Also, broadcasting!\n                deltahebb = torch.bmm(hidden.unsqueeze(2), hactiv.unsqueeze(1))[0]\n                et = (1 - self.etaet) * et + self.etaet *  deltahebb\n            else:\n                raise ValueError(\"Must specify learning rule ('hebb' or 'oja')\")\n            hidden = hactiv\n            #if np.random.rand() < .05:\n            #   pdb.set_trace()\n\n\n        elif self.type == 'modul2':\n\n            # Here we try the other order:\n            #    - computing all outputs from current inputs, including DA\n            #    - computing updated Hebb\n            #    - incorporating this modified Hebb into plastic weights through neuromodulation\n\n            # The columns of w and pw are the inputs weights to a single neuron\n            if self.params['clp'] == 0:\n                hactiv = self.activ(self.i2h(inputs) + hidden.mm(self.w + torch.mul(self.alpha, pw)))\n            elif self.params['clp'] == 1:\n                hactiv = self.activ(inputs + hidden.mm(self.w + torch.mul(self.alpha, 
pw)))\n            #else:\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed later\n            valueout = self.h2v(hactiv)\n            #valueout = 0\n            if self.params['da'] == 'tanh':\n                DAout = F.tanh(self.h2DA(hactiv))\n            elif self.params['da'] == 'sig':\n                DAout = F.sigmoid(self.h2DA(hactiv))\n            elif self.params['da'] == 'lin':\n                DAout =  self.h2DA(hactiv)\n            else:\n                raise ValueError(\"Which transformation for DAout ?\")\n\n            # Updating ET before PW (note: 'hebb' variable is only for the non-modulated part of the network)\n            if self.rule == 'hebb':\n                deltahebb = torch.bmm(hidden.unsqueeze(2), hactiv.unsqueeze(1))[0]\n                hebb = (1 - self.eta) * hebb + self.eta * deltahebb\n                et = (1 - self.etaet) * et + self.etaet *  deltahebb\n            elif self.rule == 'oja':\n                #raise ValueError(\"Not yet implemented!\")\n                hebb = hebb + self.eta * torch.mul((hidden[0].unsqueeze(1) - torch.mul(hebb , hactiv[0].unsqueeze(0))) , hactiv[0].unsqueeze(0))  # Oja's rule. Remember that yin, yout are row vectors (dim (1,N)). 
Also, broadcasting!\n                deltahebb = torch.bmm(hidden.unsqueeze(2), hactiv.unsqueeze(1))[0]\n                et = (1 - self.etaet) * et + self.etaet *  deltahebb\n            else:\n                raise ValueError(\"Must specify learning rule ('hebb' or 'oja')\")\n\n\n            if self.params['addpw'] == 3:\n                # Hard clamp\n                deltapw = DAout[0,0] * et\n                pw1 = torch.clamp(pw + deltapw, min=-1.0, max=1.0)\n            #if self.params['addpw'] == 3:\n            #    # Constrained AND cubed: This makes the soft bounds \"softer\", so the values can come closer to -1 and 1.\n            #    # Absolutely no difference in performance from addpw=2 !\n            #    deltapw = DAout[0,0] * et\n            #    pw1 = torch.clamp( pw +  torch.clamp(deltapw, min=0.0) * (1 - pw ** 3) +  torch.clamp(deltapw, max=0.0) * (pw ** 3 + 1) , min=-1.0, max=1.0)\n            #    #if np.random.rand() < .05:\n            #    #    pdb.set_trace()\n            elif self.params['addpw'] == 2:\n                deltapw = DAout[0,0] * et\n                # This constrains the pw to stay within [-1, 1] (we could do that by putting a tanh on top of it, but we want pw itself to remain within that range to avoid large gradients)\n                # The outer clamp is there for safety. 
In theory the expression within that clamp is \"softly\" constrained to stay within [-1, 1], but finite-size effects might throw it off.\n                # Note that cubing pw in the boundary terms below would make the bounds \"softer\" and allow a wider range, but in practice it makes no difference in performance.\n                pw1 = torch.clamp( pw +  torch.clamp(deltapw, min=0.0) * (1 - pw) +  torch.clamp(deltapw, max=0.0) * (pw + 1) , min=-.99999, max=.99999)\n                #if np.random.rand() < .05:\n                #    pdb.set_trace()\n            if self.params['addpw'] == 1: # Purely additive, tends to make the meta-learning diverge\n                deltapw = DAout[0,0] * et\n                pw1 = pw + deltapw\n            elif self.params['addpw'] == 0:\n                # Problem: this makes the plastic weights decaying!\n                pw1 = pw - torch.abs(self.etapw) * pw + self.etapw * DAout[0,0] * et\n\n            # Should we have a fully neuromodulated network, or only half?\n            if self.params['fm'] == 1:\n                pw = pw1\n            elif self.params['fm'] == 0:\n                pw = torch.cat( (hebb[:, :self.params['hs']//2], pw1[:, self.params['hs'] // 2:]), dim=1) # Use output argument?\n            else:\n                raise ValueError(\"Must select whether fully modulated or not\")\n\n            hidden = hactiv\n            #if np.random.rand() < .05:\n            #   pdb.set_trace()\n\n\n        return activout, valueout, hidden, hebb, et, pw\n\n\n\n    def initialZeroHebb(self):\n        return Variable(torch.zeros(self.params['hs'], self.params['hs']) , requires_grad=False).cuda()\n    def initialZeroPlasticWeights(self):\n        return Variable(torch.zeros(self.params['hs'], self.params['hs']) , requires_grad=False).cuda()\n\n    def initialZeroState(self):\n        if self.params['type'] == 'lstm':\n            return (Variable(torch.zeros(1, 1, self.params['hs']), requires_grad=False).cuda() , 
Variable(torch.zeros(1, 1, self.params['hs']), requires_grad=False ).cuda() )\n        elif self.params['type'] == 'lstmmanual' or self.params['type'] == 'lstmplastic':\n            return (Variable(torch.zeros(1, self.params['hs']), requires_grad=False).cuda() , Variable(torch.zeros(1, self.params['hs']), requires_grad=False ).cuda() )\n        elif self.params['type'] == 'rnn' or self.params['type'] == 'plastic'  or self.params['type'] == 'modul' or self.params['type'] == 'modul2' or self.params['type'] == 'modplast' or self.params['type'] == 'modplast2':\n            return Variable(torch.zeros(1, self.params['hs']), requires_grad=False ).cuda()\n\n\n\ndef train(paramdict):\n    #params = dict(click.get_current_context().params)\n\n    #TOTALNBINPUTS =  RFSIZE * RFSIZE + ADDINPUT + NBNONRESTACTIONS\n    print(\"Starting training...\")\n    params = {}\n    #params.update(defaultParams)\n    params.update(paramdict)\n    print(\"Passed params: \", params)\n    print(platform.uname())\n    #params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['interpresdelay']) * params['nbclasses']) + params['prestimetest']  # Total number of steps per episode\n    suffix = \"maz_\"+\"\".join([str(x)+\"_\" if pair[0] != 'nbsteps' and pair[0] != 'rngseed' and pair[0] != 'save_every' and pair[0] != 'test_every' and pair[0] != 'print_every' else '' for pair in sorted(zip(params.keys(), params.values()), key=lambda x:x[0] ) for x in pair])[:-1] + \"_rngseed_\" + str(params['rngseed'])   # Turning the parameters into a nice suffix for filenames\n\n    # Initialize random seeds (first two redundant?)\n    print(\"Setting random seeds\")\n    np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n    #print(click.get_current_context().params)\n\n    print(\"Initializing network\")\n    net = Network(params)\n    print (\"Shape of all optimized parameters:\", [x.size() for x in net.parameters()])\n    
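The `addpw == 2` branches of `forward()` rely on a soft-bound update in which positive increments are scaled by the remaining headroom `(1 - pw)` and negative ones by `(pw + 1)`. A standalone NumPy sketch (hypothetical sizes) showing that the values stay within [-1, 1] no matter how hard they are pushed:

```python
import numpy as np

# Soft-bound ("soft clamp") update, as in the addpw == 2 branches of forward():
# positive deltas are scaled by (1 - pw), negative deltas by (pw + 1), so pw
# is softly confined to [-1, 1]; the outer clip is a numerical safety net.
def soft_clamp_update(pw, delta):
    pos = np.clip(delta, 0.0, None)
    neg = np.clip(delta, None, 0.0)
    return np.clip(pw + pos * (1 - pw) + neg * (pw + 1), -1.0, 1.0)

pw = np.zeros((4, 4))
for _ in range(1000):
    pw = soft_clamp_update(pw, 0.5 * np.ones((4, 4)))  # keep pushing upward
```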
allsizes = [torch.numel(x.data.cpu()) for x in net.parameters()]\n    print (\"Size (numel) of all optimized elements:\", allsizes)\n    print (\"Total size (numel) of all optimized elements:\", sum(allsizes))\n\n    #total_loss = 0.0\n    print(\"Initializing optimizer\")\n    optimizer = torch.optim.Adam(net.parameters(), lr=1.0*params['lr'], eps=1e-4, weight_decay=params['l2'])\n    #optimizer = torch.optim.SGD(net.parameters(), lr=1.0*params['lr'])\n    #scheduler = torch.optim.lr_scheduler.StepLR(optimizer, gamma=params['gamma'], step_size=params['steplr'])\n\n    LABSIZE = params['msize']\n    lab = np.ones((LABSIZE, LABSIZE))\n    CTR = LABSIZE // 2\n\n    # Simple cross maze\n    #lab[CTR, 1:LABSIZE-1] = 0\n    #lab[1:LABSIZE-1, CTR] = 0\n\n\n    # Double-T maze\n    #lab[CTR, 1:LABSIZE-1] = 0\n    #lab[1:LABSIZE-1, 1] = 0\n    #lab[1:LABSIZE-1, LABSIZE - 2] = 0\n\n    # Grid maze\n    lab[1:LABSIZE-1, 1:LABSIZE-1].fill(0)\n    for row in range(1, LABSIZE - 1):\n        for col in range(1, LABSIZE - 1):\n            if row % 2 == 0 and col % 2 == 0:\n                lab[row, col] = 1\n    # Not strictly necessary, but cleaner since we start the agent at the\n    # center for each episode; may help localization in some maze sizes\n    # (including 13 and 9, but not 11) by introducing a 
detectable irregularity\n    # in the center:\n    lab[CTR,CTR] = 0\n\n\n\n    all_losses = []\n    all_losses_objective = []\n    all_total_rewards = []\n    all_losses_v = []\n    lossbetweensaves = 0\n    nowtime = time.time()\n    meanrewards = np.zeros((LABSIZE, LABSIZE))\n    meanrewardstmp = np.zeros((LABSIZE, LABSIZE, params['eplen']))\n\n    pos = 0\n    hidden = net.initialZeroState()\n    hebb = net.initialZeroHebb()\n    pw = net.initialZeroPlasticWeights()\n\n    #celoss = torch.nn.CrossEntropyLoss() # For supervised learning - not used here\n\n\n    print(\"Starting episodes!\")\n\n    for numiter in range(params['nbiter']):\n\n        PRINTTRACE = 0\n        #if (numiter+1) % (1 + params['print_every']) == 0:\n        if (numiter+1) % (params['print_every']) == 0:\n            PRINTTRACE = 1\n\n        #lab = makemaze.genmaze(size=LABSIZE, nblines=4)\n        #count = np.zeros((LABSIZE, LABSIZE))\n\n        # Select the reward location for this episode - not on a wall!\n        # And not on the center either! 
(though not sure how useful that restriction is...)\n        rposr = 0; rposc = 0\n        while lab[rposr, rposc] == 1 or (rposr == CTR and rposc == CTR):\n            rposr = np.random.randint(1, LABSIZE - 1)\n            rposc = np.random.randint(1, LABSIZE - 1)\n\n        # We always start the episode from the center (when hitting reward, we may teleport either to center or to a random location depending on params['rsp'])\n        posc = CTR\n        posr = CTR\n\n        optimizer.zero_grad()\n        loss = 0\n        lossv = 0\n        hidden = net.initialZeroState()\n        hebb = net.initialZeroHebb()\n        et = net.initialZeroHebb() # Eligibility Trace is identical to Hebbian Trace in shape\n        pw = net.initialZeroPlasticWeights()\n        numactionchosen = 0\n\n\n        reward = 0.0\n        rewards = []\n        vs = []\n        logprobs = []\n        sumreward = 0.0\n        dist = 0\n        rewarddelay = -1\n        rewardpercep = 0\n\n        #reloctime = np.random.randint(params['eplen'] // 4, (3 * params['eplen']) // 4)\n\n        #print(\"EPISODE \", numiter)\n        for numstep in range(params['eplen']):\n\n\n\n            ## We randomly relocate the reward halfway through\n            #if numstep == reloctime:\n            #    rposr = 0; rposc = 0\n            #    while lab[rposr, rposc] == 1 or (rposr == CTR and rposc == CTR):\n            #        rposr = np.random.randint(1, LABSIZE - 1)\n            #        rposc = np.random.randint(1, LABSIZE - 1)\n\n\n            if params['clp'] == 0:\n                inputs = np.zeros((1, TOTALNBINPUTS), dtype='float32')\n            else:\n                inputs = np.zeros((1, params['hs']), dtype='float32')\n\n            labg = lab.copy()\n            #labg[rposr, rposc] = -1  # The agent can see the reward if it falls within its RF\n            inputs[0, 0:RFSIZE * RFSIZE] = labg[posr - RFSIZE//2:posr + RFSIZE//2 +1, posc - RFSIZE //2:posc + RFSIZE//2 +1].flatten() * 1.0\n\n            
# Additional inputs: bias, time step, perceived reward, and previous chosen action\n            inputs[0, RFSIZE * RFSIZE +1] = 1.0 # Bias neuron\n            inputs[0, RFSIZE * RFSIZE +2] = numstep / params['eplen']\n            #inputs[0, RFSIZE * RFSIZE +3] = 1.0 * reward # Reward from previous time step\n            inputs[0, RFSIZE * RFSIZE +3] = 1.0 * rewardpercep\n            inputs[0, RFSIZE * RFSIZE + ADDINPUT + numactionchosen] = 1\n            #inputs = 100.0 * inputs  # input boosting : Very bad with clamp=0\n            inputsC = torch.from_numpy(inputs).cuda()\n            # Might be better:\n            #if rposr == posr and rposc == posc:\n            #    inputs[0][-4] = 100.0\n            #else:\n            #    inputs[0][-4] = 0\n\n            # Running the network\n            y, v, hidden, hebb, et, pw = net(Variable(inputsC, requires_grad=False), hidden, hebb, et, pw)  # y should output raw scores, not probas\n\n            # For now:\n            #numactionchosen = np.argmax(y.data[0])\n            # But wait, this is bad, because the network needs to see the\n            # reward signal to guide its own (within-episode) learning...
and\n            # argmax might not provide enough exploration for this!\n\n            #ee = np.exp(y.data[0].cpu().numpy())\n            #numactionchosen = np.random.choice(NBNONRESTACTIONS, p = ee / (1e-10 + np.sum(ee)))\n\n            y = F.softmax(y, dim=1)\n            # Must convert y to probas to use this !\n            distrib = torch.distributions.Categorical(y)\n            actionchosen = distrib.sample()  # sample() returns a Pytorch tensor of size 1; this is needed for the backprop below\n            numactionchosen = actionchosen.data[0]    # Turn to scalar\n\n\n            #if numiter == 103 and numstep == 98: \n            #    pdb.set_trace()\n\n\n            tgtposc = posc\n            tgtposr = posr\n            if numactionchosen == 0:  # Up\n                tgtposr -= 1\n            elif numactionchosen == 1:  # Down\n                tgtposr += 1\n            elif numactionchosen == 2:  # Left\n                tgtposc -= 1\n            elif numactionchosen == 3:  # Right\n                tgtposc += 1\n            else:\n                raise ValueError(\"Wrong Action\")\n\n            reward = 0.0\n            if lab[tgtposr][tgtposc] == 1:\n                # Hit wall!\n                reward = -params['wp']\n            else:\n                dist += 1\n                posc = tgtposc\n                posr = tgtposr\n\n            # Did we hit the reward location ?\n            if rposr == posr and rposc == posc:\n                reward += params['rew']\n                if params['rsp'] == 1:\n                    posr = np.random.randint(1, LABSIZE - 1)\n                    posc = np.random.randint(1, LABSIZE - 1)\n                    while lab[posr, posc] == 1 or (rposr == posr and rposc == posc):\n                        posr = np.random.randint(1, LABSIZE - 1)\n                        posc = np.random.randint(1, LABSIZE - 1)\n                else:\n                    posr = CTR\n                    posc = CTR\n            rewardpercep = 
reward\n            # This is with reward delay. Not necessarily buggy, but it causes some divergences w/ batch due to the reward counter for not detecting rewards\n            #    if rewarddelay < 0:  # Make sure that the reward delay counter is not active. NOTE: this can cause weirdnesses if e.g. re-teleporting multiple times on the reward location....?\n            #        # If we already have hit the reward location, but haven't\n            #        # perceived it / been transported yet, we don't care if we\n            #        # do it again before the perception (and transportation)\n            #        # has occurred\n            #        reward += params['rew']  # That is the reward that meta-learning cares about - not the one perceived by the agent, which is delayed\n            #        rewarddelay = 1 + np.random.randint(1 + params['md'])\n\n            #rewardpercep = 0\n            #if rewarddelay > -1:\n            #    rewarddelay -= 1\n            #if rewarddelay == 0:\n            #    # Now we can perceive the reward (and teleport)!\n            #    # NOTE: in this implementation, the agent only perceives the positive\n            #    # rewards - not the 'pain' of hitting the walls. 
That's OK (not\n            #    # something you *need* to learn within-life, outer loop can\n            #    # learn it)!\n            #    rewardpercep = params['rew']\n            #    if params['rsp'] == 1:\n            #        posr = np.random.randint(1, LABSIZE - 1)\n            #        posc = np.random.randint(1, LABSIZE - 1)\n            #        while lab[posr, posc] == 1 or (rposr == posr and rposc == posc):\n            #            posr = np.random.randint(1, LABSIZE - 1)\n            #            posc = np.random.randint(1, LABSIZE - 1)\n            #    else:\n            #        posr = CTR\n            #        posc = CTR\n\n            ## Exploration reward (actually a penalty on the normalized visit count of the new location)\n            #count[posr, posc] += 1\n            #reward -= (count[posr, posc] / np.sum(count)) * params['exprew']\n\n\n            if PRINTTRACE:\n                #print(\"Step \", numstep, \"- GI: \", goodinputs, \", GA: \", goodaction, \" Inputs: \", inputsN, \" - Outputs: \", y.data.cpu().numpy(), \" - action chosen: \", numactionchosen,\n                #        \" - inputsthisstep:\", inputsthisstep, \" - mean abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())), \" -Rew: \", reward)\n                print(\"Step \", numstep, \" Inputs: \", inputs[0,:TOTALNBINPUTS], \" - Outputs: \", y.data.cpu().numpy(), \" - action chosen: \", numactionchosen,\n                        \" - mean abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())), \" -Reward (this step): \", reward)\n            rewards.append(reward)\n            vs.append(v)\n            sumreward += reward\n\n\n\n            logprobs.append(distrib.log_prob(actionchosen))\n\n            #if params['algo'] == 'A3C':\n            loss += params['bent'] * y.pow(2).sum()   # We want to penalize concentration, i.e.
encourage diversity; our version of PyTorch does not have an entropy() function for Distribution, so we use this instead.\n\n            ##if PRINTTRACE:\n            ##    print(\"Probabilities:\", y.data.cpu().numpy(), \"Picked action:\", numactionchosen, \", got reward\", reward)\n\n\n        # Episode is done, now let's do the actual computations\n        gammaR = params['gr']\n        if True: #params['algo'] == 'A3C':\n            R = 0\n            for numstepb in reversed(range(params['eplen'])) :\n                #BATCHSIZE = 1\n                #R = gammaR * R + rewards[numstepb]\n                #ctrR = R - vs[numstepb][0]\n                #lossv += ctrR.pow(2).sum() / BATCHSIZE\n                #loss -= (logprobs[numstepb] * ctrR.detach()).sum() / BATCHSIZE  # Need to check if detach() is OK\n                R = gammaR * R + rewards[numstepb]\n                lossv += (vs[numstepb][0] - R).pow(2)\n                loss -= logprobs[numstepb] * (R - vs[numstepb].data[0][0])  # Not sure if the \"data\" is needed... 
put it b/c of worry about weird gradient flows\n            loss += params['blossv'] * lossv\n\n        #elif params['algo'] == 'REI':\n        #    R = sumreward\n        #    baseline = meanrewards[rposr, rposc]\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        loss -= logprobs[numstepb] * (R - baseline)\n        #elif params['algo'] == 'REINOB':\n        #    R = sumreward\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        loss -= logprobs[numstepb] * R\n        #elif params['algo'] == 'REITMP':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        loss -= logprobs[numstepb] * R\n        #elif params['algo'] == 'REITMPB':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        loss -= logprobs[numstepb] * (R - meanrewardstmp[rposr, rposc, numstepb])\n\n        #else:\n        #    raise ValueError(\"Which algo?\")\n\n        #meanrewards[rposr, rposc] = (1.0 - params['nu']) * meanrewards[rposr, rposc] + params['nu'] * sumreward\n        #R = 0\n        #for numstepb in reversed(range(params['eplen'])) :\n        #    R = gammaR * R + rewards[numstepb]\n        #    meanrewardstmp[rposr, rposc, numstepb] = (1.0 - params['nu']) * meanrewardstmp[rposr, rposc, numstepb] + params['nu'] * R\n\n        loss /= params['eplen']\n\n        if PRINTTRACE:\n            if True: #params['algo'] == 'A3C':\n                print(\"lossv: \", lossv.data.cpu().numpy()[0])\n            print (\"Total reward for this episode:\", sumreward, \"Dist:\", dist)\n\n        #if params['squash'] == 1:\n        #    if sumreward < 0:\n        #        sumreward = -np.sqrt(-sumreward)\n        #    else:\n        #        sumreward = np.sqrt(sumreward)\n        #elif params['squash'] == 0:\n        #    pass\n        #else:\n    
#    raise ValueError(\"Incorrect value for squash parameter\")\n\n        #loss *= sumreward\n\n        #for p in net.parameters():\n        #    p.grad.data.clamp_(-params['clp'], params['clp'])\n        if numiter > 100:  # Burn-in period for meanrewards\n            loss.backward()\n            optimizer.step()\n\n        #torch.cuda.empty_cache()\n\n        #print(sumreward)\n        lossnum = loss.data[0]\n        lossbetweensaves += lossnum\n        all_losses_objective.append(lossnum)\n        all_total_rewards.append(sumreward)\n        #all_losses_v.append(lossv.data[0])\n        #total_loss  += lossnum\n\n\n        if (numiter+1) % params['print_every'] == 0:\n\n            np.savetxt('a2.txt', all_losses_objective)\n\n            print(numiter, \"====\")\n            print(\"Mean loss: \", lossbetweensaves / params['print_every'])\n            lossbetweensaves = 0\n            print(\"Mean reward: \", np.sum(all_total_rewards[-params['print_every']:])/ params['print_every'])\n            previoustime = nowtime\n            nowtime = time.time()\n            print(\"Time spent on last\", params['print_every'], \"iters: \", nowtime - previoustime)\n            if params['type'] == 'plastic' or params['type'] == 'lstmplastic':\n                print(\"ETA: \", net.eta.data.cpu().numpy(), \"alpha[0,1]: \", net.alpha.data.cpu().numpy()[0,1], \"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n            elif params['type'] == 'modul':\n                print(\"ETA: \", net.eta.data.cpu().numpy(), \" etaet: \", net.etaet.data.cpu().numpy(), \" mean-abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())))\n            elif params['type'] == 'rnn':\n                print(\"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n\n        if (numiter+1) % params['save_every'] == 0:\n            print(\"Saving files...\")\n#            lossbetweensaves /= params['save_every']\n#
print(\"Average loss over the last\", params['save_every'], \"episodes:\", lossbetweensaves)\n#            print(\"Alternative computation (should be equal):\", np.mean(all_losses_objective[-params['save_every']:]))\n            losslast100 = np.mean(all_losses_objective[-100:])\n            print(\"Average loss over the last 100 episodes:\", losslast100)\n#            # Instability detection; necessary for SELUs, which seem to be divergence-prone\n#            # Note that if we are unlucky enough to have diverged within the last 100 timesteps, this may not save us.\n#            if losslast100 > 2 * lossbetweensavesprev:\n#                print(\"We have diverged ! Restoring last savepoint!\")\n#                net.load_state_dict(torch.load('./torchmodel_'+suffix + '.txt'))\n#            else:\n            print(\"Saving local files...\")\n            #with open('params_'+suffix+'.dat', 'wb') as fo:\n            #        #pickle.dump(net.w.data.cpu().numpy(), fo)\n            #        #pickle.dump(net.alpha.data.cpu().numpy(), fo)\n            #        #pickle.dump(net.eta.data.cpu().numpy(), fo)\n            #        #pickle.dump(all_losses, fo)\n            #        pickle.dump(params, fo)\n            #with open('loss_'+suffix+'.txt', 'w') as thefile:\n            #    for item in all_losses_objective:\n            #            thefile.write(\"%s\\n\" % item)\n            #with open('lossv_'+suffix+'.txt', 'w') as thefile:\n            #    for item in all_losses_v:\n            #            thefile.write(\"%s\\n\" % item)\n            with open('loss_'+suffix+'.txt', 'w') as thefile:\n                for item in all_total_rewards[::10]:\n                        thefile.write(\"%s\\n\" % item)\n            torch.save(net.state_dict(), 'torchmodel_'+suffix+'.dat')\n            with open('params_'+suffix+'.dat', 'wb') as fo:\n                pickle.dump(params, fo)\n            if os.path.isdir('/mnt/share/tmiconi'):\n                print(\"Transferring to NFS 
storage...\")\n                for fn in ['params_'+suffix+'.dat', 'loss_'+suffix+'.txt', 'torchmodel_'+suffix+'.dat']:\n                    result = os.system(\n                        'cp {} {}'.format(fn, '/mnt/share/tmiconi/modulmaze/'+fn))\n                print(\"Done!\")\n#            lossbetweensavesprev = lossbetweensaves\n#            lossbetweensaves = 0\n#            sys.stdout.flush()\n#            sys.stderr.flush()\n\n\n\nif __name__ == \"__main__\":\n#defaultParams = {\n#    'type' : 'lstm',\n#    'seqlen' : 200,\n#    'hs': 500,\n#    'activ': 'tanh',\n#    'steplr': 10e9,  # By default, no change in the learning rate\n#    'gamma': .5,  # The annealing factor of learning rate decay for Adam\n#    'imagesize': 31,\n#    'nbiter': 30000,\n#    'lr': 1e-4,\n#    'test_every': 10,\n#    'save_every': 3000,\n#    'rngseed':0\n#}\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--rngseed\", type=int, help=\"random seed\", default=0)\n    #parser.add_argument(\"--clamp\", type=float, help=\"maximum (absolute value) gradient for clamping\", default=1000000.0)\n    #parser.add_argument(\"--wp\", type=float, help=\"wall penalty (reward decrement for hitting a wall)\", default=0.1)\n    parser.add_argument(\"--rew\", type=float, help=\"reward value (reward increment for hitting the reward location)\", default=1.0)\n    parser.add_argument(\"--wp\", type=float, help=\"penalty for hitting walls\", default=.05)\n    #parser.add_argument(\"--pen\", type=float, help=\"penalty value (reward decrement for taking any non-rest action)\", default=.2)\n    #parser.add_argument(\"--exprew\", type=float, help=\"exploration penalty coefficient (on normalized visit count)\", default=.0)\n    parser.add_argument(\"--bent\", type=float, help=\"coefficient for the entropy reward (really Simpson index concentration measure)\", default=0.03)\n    #parser.add_argument(\"--probarev\", type=float, help=\"probability of reversal (random change)
in desired stimulus-response, per time step\", default=0.0)\n    parser.add_argument(\"--blossv\", type=float, help=\"coefficient for value prediction loss\", default=.1)\n    #parser.add_argument(\"--lsize\", type=int, help=\"size of the labyrinth; must be odd\", default=7)\n    #parser.add_argument(\"--randstart\", type=int, help=\"when hitting reward, should we teleport to random location (1) or center (0)?\", default=0)\n    #parser.add_argument(\"--rp\", type=int, help=\"whether the reward should be on the periphery\", default=0)\n    #parser.add_argument(\"--squash\", type=int, help=\"squash reward through signed sqrt (1 or 0)\", default=0)\n    #parser.add_argument(\"--nbarms\", type=int, help=\"number of arms\", default=2)\n    #parser.add_argument(\"--nbseq\", type=int, help=\"number of sequences between reinitializations of hidden/Hebbian state and position\", default=3)\n    #parser.add_argument(\"--activ\", help=\"activ function ('tanh' or 'selu')\", default='tanh')\n    #parser.add_argument(\"--algo\", help=\"meta-learning algorithm (A3C or REI)\", default='A3C')\n    parser.add_argument(\"--rule\", help=\"learning rule ('hebb' or 'oja')\", default='hebb')\n    parser.add_argument(\"--type\", help=\"network type ('lstm' or 'rnn' or 'plastic')\", default='modul')\n    parser.add_argument(\"--msize\", type=int, help=\"size of the maze; must be odd\", default=9)\n    parser.add_argument(\"--da\", help=\"transformation function of DA signal (tanh or sig or lin)\", default='tanh')\n    parser.add_argument(\"--gr\", type=float, help=\"gammaR: discounting factor for rewards\", default=.9)\n    parser.add_argument(\"--lr\", type=float, help=\"learning rate (Adam optimizer)\", default=1e-4)\n    parser.add_argument(\"--fm\", type=int, help=\"if using neuromodulation, do we modulate the whole network (1) or just half (0) ?\", default=1)\n    #parser.add_argument(\"--nu\", type=float, help=\"REINFORCE baseline time constant\", default=.1)\n    
#parser.add_argument(\"--samestep\", type=int, help=\"compare stimulus and response in the same step (1) or from successive steps (0) ?\", default=0)\n    #parser.add_argument(\"--nbin\", type=int, help=\"number of possible input stimuli\", default=4)\n    #parser.add_argument(\"--modhalf\", type=int, help=\"which half of the recurrent network receives modulation (1 or 2)\", default=1)\n    #parser.add_argument(\"--nbac\", type=int, help=\"number of possible non-rest actions\", default=4)\n    parser.add_argument(\"--rsp\", type=int, help=\"when hitting the reward, does the agent teleport to a random location (1) or back to the center (0) ?\", default=1)\n    parser.add_argument(\"--addpw\", type=int, help=\"are plastic weights purely additive (1) or forgetting (0) ?\", default=1)\n    parser.add_argument(\"--clp\", type=int, help=\"inputs clamped (1), fully clamped (2) or through linear layer (0) ?\", default=0)\n    parser.add_argument(\"--md\", type=int, help=\"maximum delay for reward reception\", default=0)\n    parser.add_argument(\"--eplen\", type=int, help=\"length of episodes\", default=100)\n    #parser.add_argument(\"--exptime\", type=int, help=\"exploration (no reward) time (must be < eplen)\", default=0)\n    parser.add_argument(\"--hs\", type=int, help=\"size of the recurrent (hidden) layer\", default=100)\n    parser.add_argument(\"--l2\", type=float, help=\"coefficient of L2 norm (weight decay)\", default=3e-6)\n    #parser.add_argument(\"--steplr\", type=int, help=\"duration of each step in the learning rate annealing schedule\", default=100000000)\n    #parser.add_argument(\"--gamma\", type=float, help=\"learning rate annealing factor\", default=0.3)\n    parser.add_argument(\"--nbiter\", type=int, help=\"number of learning cycles\", default=1000000)\n    parser.add_argument(\"--save_every\", type=int, help=\"number of cycles between successive save points\", default=1000)\n    parser.add_argument(\"--print_every\", type=int, help=\"number of cycles between successive
printing of information\", default=100)\n    args = parser.parse_args(); argvars = vars(args); argdict = { k : argvars[k] for k in argvars if argvars[k] is not None }\n    #train()\n    train(argdict)\n\n"
  },
  {
    "path": "omniglot/.ipynb_checkpoints/Omniglot Data Loading-checkpoint.ipynb",
    "content": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 6,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Loading Omniglot data...\\n\",\n      \"['/home/tmiconi/exp/omniglot-master/python/images_background/Futurama', '/home/tmiconi/exp/omniglot-master/python/images_background/Japanese_(katakana)', '/home/tmiconi/exp/omniglot-master/python/images_background/Cyrillic', '/home/tmiconi/exp/omniglot-master/python/images_background/Tagalog']\\n\",\n      \"['/home/tmiconi/exp/omniglot-master/python/images_evaluation/Ge_ez', '/home/tmiconi/exp/omniglot-master/python/images_evaluation/Tibetan', '/home/tmiconi/exp/omniglot-master/python/images_evaluation/Tengwar', '/home/tmiconi/exp/omniglot-master/python/images_evaluation/Atemayar_Qelisayer']\\n\",\n      \"1623\\n\",\n      \"(105, 105)\\n\",\n      \"Data loaded!\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"# Loading Omniglot data\\n\",\n    \"import numpy as np\\n\",\n    \"import matplotlib.pyplot as plt\\n\",\n    \"import glob\\n\",\n    \"\\n\",\n    \"print(\\\"Loading Omniglot data...\\\")\\n\",\n    \"imagedata = []\\n\",\n    \"imagefilenames=[]\\n\",\n    \"for basedir in ('/home/tmiconi/exp/omniglot-master/python/images_background/', \\n\",\n    \"                '/home/tmiconi/exp/omniglot-master/python/images_evaluation/'):\\n\",\n    \"    alphabetdirs = glob.glob(basedir+'*')\\n\",\n    \"    print(alphabetdirs[:4])\\n\",\n    \"    for alphabetdir in alphabetdirs:\\n\",\n    \"        chardirs = glob.glob(alphabetdir+\\\"/*\\\")\\n\",\n    \"        for chardir in chardirs:\\n\",\n    \"            chardata = []\\n\",\n    \"            charfiles = glob.glob(chardir+'/*')\\n\",\n    \"            for fn in charfiles:\\n\",\n    \"                filedata = plt.imread(fn)\\n\",\n    \"                chardata.append(filedata)\\n\",\n    \"            
imagedata.append(chardata)\\n\",\n    \"            imagefilenames.append(fn)\\n\",\n    \"# imagedata is now a list of lists of numpy arrays \\n\",\n    \"# imagedata[CharacterNumber][FileNumber] -> numpy(105,105)\\n\",\n    \"print(len(imagedata))\\n\",\n    \"print(imagedata[1][2].shape)\\n\",\n    \"print(\\\"Data loaded!\\\")\\n\",\n    \"\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 101,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"/usr/local/lib/python3.5/dist-packages/skimage/transform/_warps.py:84: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15.\\n\",\n      \"  warn(\\\"The default mode, 'constant', will be changed to 'reflect' in \\\"\\n\"\n     ]\n    },\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Displayed.\\n\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"dtype('float64')\"\n      ]\n     },\n     \"execution_count\": 101,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    },\n    {\n     \"data\": {\n      \"image/png\":
\"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAADFtJREFUeJzt3W2sHOV5h/Hr7gmkEskHMK7rOG5M\\nI4pEkHCqU7c0qEqVFxyKavIFxaoit6VxlAaptKkURFUVVVVFI5I0H1qkk2JhKkqoRBBWZZVQqyqK\\nGgEH5GAMTaDEEXb8hp02phJvJ3c/nHF0Aufsrndnd9bc109a7ew8szu3x/57Xp6dfSIzkVTPz3Rd\\ngKRuGH6pKMMvFWX4paIMv1SU4ZeKMvxSUYZfKsrwS0W9bZIru/CCmdyw/pxJrlIq5cALr/HiyYUY\\nZNmRwh8Rm4GvADPAP2Tmrb2W37D+HB59cP0oq5TUw6arXhh42aEP+yNiBvg74GPApcDWiLh02M+T\\nNFmjnPNvAp7LzOcz81Xga8CWdsqSNG6jhH8dsPQY42Az76dExPaImI+I+eMnFkZYnaQ2jf1qf2bO\\nZeZsZs6uXjUz7tVJGtAo4T8ELL169+5mnqSzwCjhfwy4OCIuiohzgU8Au9opS9K4Dd3Vl5mvR8QN\\nwIMsdvXtyMz9rVUmaaxG6ufPzN3A7pZqkTRBfr1XKsrwS0UZfqkowy8VZfilogy/VJThl4oy/FJR\\nhl8qyvBLRRl+qSjDLxVl+KWiDL9UlOGXijL8UlGGXyrK8EtFGX6pKMMvFWX4paIMv1SU4ZeKMvxS\\nUYZfKsrwS0UZfqkowy8VZfilokYapTciDgCngAXg9cycbaMoSeM3Uvgbv5mZL7bwOZImyMN+qahR\\nw5/ANyLi8YjY3kZBkiZj1MP+KzPzUET8HPBQRPxXZj68dIHmP4XtAL+wro2zDEltGGnPn5mHmudj\\nwP3ApmWWmcvM2cycXb1qZpTVSWrR0OGPiPMi4p2np4GPAk+1VZik8RrlOHwNcH9EnP6cf8rMf22l\\nKkljN3T4M/N54PIWa5E0QXb1SUUZfqkowy8VZfilogy/VJThl4qaqu/bXvWujV2XIA3kwR/s7bqE\\nkbnnl4oy/FJRhl8qyvBLRRl+qSjDLxVl+KWipqqf/8iNv97Zun/+b/+zZ3uXtVW17s79PdsX/ud/\\ne7Yf+eNx/p3Zzy/pLGX4paIMv1SU4ZeKMvxSUYZfKsrwS0VFZk5sZbOX/2w++uD6ia3vTPT7LYFp\\nvn/78i/84Ypt/b6/0OWfa9y/3zDNf2fjsumqF5j/9ssxyLLu+aWiDL9UlOGXijL8UlGGXyrK8EtF\\nGX6pqL7380fEDuAa4FhmXtbMuwC4F9gAHACuy8wfjq9M9dKvL79Lo/Tl7zr0WM/23173K0N/tgbb\\n898JbH7DvJuAPZl5MbCneS3pLNI3/Jn5MHDyDbO3ADub6Z3AtS3XJWnMhj3nX5OZh5vpI8CaluqR\\nNCEjX/DLxZsDVrxBICK2R8R8RMwfP7Ew6uoktWTY8B+NiLUAzfOxlRbMzLnMnM3M2dWrZoZcnaS2\\nDRv+XcC2Znob8EA75UialL7hj4h7gG8Bl0TEwYi4HrgV+EhEPAt8uHkt6SzSt58/M7eu0PShlmsp\\na+8rr/Rs//xFv9qz/Qd/uvLv07/rtvF+B2CUfvx+99tf/eHf6fMJ3x163fIbflJZhl8qyvBLRRl+\\nqSjDLxVl+KWipmqI7requ0+t6tl+1yW9f878xB9c0bN935/8/YptV93Wuyvuon/5VM/2X9re+7ba\\nfkb5+eyFp3t35b18zaY+n1Dvp7vPhHt+qSjDLxVl+KWiDL9UlOGXijL8UlGGXyrKfv4WjDrU9Cu/\\n1fsnqOf/8vaRPr+Xfv34n3n
2uZ7t1573UpvlnJEXf+//Olv3W4F7fqkowy8VZfilogy/VJThl4oy\\n/FJRhl8qyn7+AW1+T697x1/t+d6/+d4jPds3vn167zvvsh9f4+WeXyrK8EtFGX6pKMMvFWX4paIM\\nv1SU4ZeK6tvPHxE7gGuAY5l5WTPvFuBTwPFmsZszc/e4ipwG+drKffn9f5v+7e0WcwYO3ve+nu37\\nr7h7QpVo2gyy578T2LzM/C9n5sbm8ZYOvvRW1Df8mfkwcHICtUiaoFHO+W+IiCcjYkdEnN9aRZIm\\nYtjw3w68F9gIHAa+uNKCEbE9IuYjYv74iYUhVyepbUOFPzOPZuZCZv4Y+Cqw4l0vmTmXmbOZObt6\\n1cywdUpq2VDhj4i1S15+HHiqnXIkTcogXX33AB8ELoyIg8BfAB+MiI1AAgeAT4+xRklj0Df8mbl1\\nmdl3jKGWqfbX33u0R+u5E6vjTNmPr5X4DT+pKMMvFWX4paIMv1SU4ZeKMvxSUf50d6P/bbnT251X\\nld2Yo3HPLxVl+KWiDL9UlOGXijL8UlGGXyrK8EtF2c+vs9Yld3ymZ/uGP//Wim0z77uk53t3P3Tv\\nUDWdTdzzS0UZfqkowy8VZfilogy/VJThl4oy/FJR9vPrrNWrHx/g5WtWHEiK/5iba7ucs457fqko\\nwy8VZfilogy/VJThl4oy/FJRhl8qqm8/f0SsB+4C1gAJzGXmVyLiAuBeYANwALguM384vlJVzYG/\\nuqJn+3d+//Y+n9BvLIbaBtnzvw58LjMvBX4N+GxEXArcBOzJzIuBPc1rSWeJvuHPzMOZ+UQzfQp4\\nBlgHbAF2NovtBK4dV5GS2ndG5/wRsQF4P/AIsCYzDzdNR1g8LZB0lhg4/BHxDuA+4MbM/NHStsxM\\nFq8HLPe+7RExHxHzx08sjFSspPYMFP6IOIfF4N+dmV9vZh+NiLVN+1rg2HLvzcy5zJzNzNnVq2ba\\nqFlSC/qGPyICuAN4JjO/tKRpF7Ctmd4GPNB+eZLGZZBbej8AfBLYFxGn+05uBm4F/jkirge+D1w3\\nnhJVVf+uPI2ib/gz85tArND8oXbLkTQpfsNPKsrwS0UZfqkowy8VZfilogy/VJThl4oy/FJRhl8q\\nyvBLRRl+qSjDLxVl+KWiDL9UlOGXijL8UlGGXyrK8EtFGX6pKMMvFWX4paIMv1SU4ZeKMvxSUYZf\\nKsrwS0UZfqkowy8VZfilogy/VFTf8EfE+oj494h4OiL2R8QfNfNviYhDEbG3eVw9/nIlteVtAyzz\\nOvC5zHwiIt4JPB4RDzVtX87M28ZXnqRx6Rv+zDwMHG6mT0XEM8C6cRcmabzO6Jw/IjYA7wceaWbd\\nEBFPRsSOiDh/hfdsj4j5iJg/fmJhpGIltWfg8EfEO4D7gBsz80fA7cB7gY0sHhl8cbn3ZeZcZs5m\\n5uzqVTMtlCypDQOFPyLOYTH4d2fm1wEy82hmLmTmj4GvApvGV6aktg1ytT+AO4BnMvNLS+avXbLY\\nx4Gn2i9P0rgMcrX/A8AngX0RsbeZdzOwNSI2AgkcAD49lgoljcUgV/u/CcQyTbvbL0fSpPgNP6ko\\nwy8VZfilogy/VJThl4oy/FJRhl8qyvBLRRl+qSjDLxVl+KWiDL9UlOGXijL8UlGRmZNbWcRx4PtL\\nZl0IvDixAs7MtNY2rXWBtQ2rzdrek5mrB1lwouF/08oj5jNztrMCepjW2qa1LrC2YXVVm4f9UlGG\\nXyqq6/DPdbz+Xqa1tmmtC6xtWJ3U1uk5v6TudL3nl9SRTsIfEZsj4jsR8VxE3NRFDSuJiAMRsa8Z\\neXi+41p2RMSxiHhqybwLIuKhiHi2eV52mLSOapuKkZt7jCzd6babthGvJ37YHxEzwHeBjwAHgceA\\nrZn59EQLWUFEHABmM7PzPuGI+A3gJeCuzLysmfcF4GRm3tr8
x3l+Zn5+Smq7BXip65GbmwFl1i4d\\nWRq4FvhdOtx2Peq6jg62Wxd7/k3Ac5n5fGa+CnwN2NJBHVMvMx8GTr5h9hZgZzO9k8V/PBO3Qm1T\\nITMPZ+YTzfQp4PTI0p1uux51daKL8K8DXljy+iDTNeR3At+IiMcjYnvXxSxjTTNsOsARYE2XxSyj\\n78jNk/SGkaWnZtsNM+J127zg92ZXZuYvAx8DPtsc3k6lXDxnm6bumoFGbp6UZUaW/okut92wI163\\nrYvwHwLWL3n97mbeVMjMQ83zMeB+pm/04aOnB0ltno91XM9PTNPIzcuNLM0UbLtpGvG6i/A/Blwc\\nERdFxLnAJ4BdHdTxJhFxXnMhhog4D/go0zf68C5gWzO9DXigw1p+yrSM3LzSyNJ0vO2mbsTrzJz4\\nA7iaxSv+/w38WRc1rFDXLwLfbh77u64NuIfFw8DXWLw2cj2wCtgDPAv8G3DBFNX2j8A+4EkWg7a2\\no9quZPGQ/klgb/O4uutt16OuTrab3/CTivKCn1SU4ZeKMvxSUYZfKsrwS0UZfqkowy8VZfilov4f\\nNMLe08/YJvoAAAAASUVORK5CYII=\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f6224424b38>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/png\": \"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAC7ZJREFUeJzt3V2IXPUZx/Hfr2tUiBaSNY1pjI0V\\nKYhgLNO0VikWq4mhkEjBmgtJQYz4AlqkVWxp7V1qq7YXrRA1mIrVFqyYi9CYBkFEsa4SNdG2URsx\\nMWZjcqHSFs369GJPZNWdM+PMOXMmeb4fWHb2nHl5GPLNvJzZ/TsiBCCfzzU9AIBmED+QFPEDSRE/\\nkBTxA0kRP5AU8QNJET+QFPEDSR01yBs7YfZILFwwY5A3CaSy840P9PaBCXdz3r7it71U0m8ljUi6\\nOyLWlJ1/4YIZ+vumBf3cJIASi5e80fV5e37ab3tE0u8kXSTpdEkrbZ/e6/UBGKx+XvMvlvRKRLwW\\nEe9LelDS8mrGAlC3fuKfL2nqc4xdxbaPsb3a9pjtsX37J/q4OQBVqv3d/ohYGxGtiGjNGR2p++YA\\ndKmf+HdLmvru3UnFNgCHgX7if0bSabZPsX20pEslbahmLAB16/lQX0QctH2tpE2aPNS3LiK2VzYZ\\ngFr1dZw/IjZK2ljRLAAGiI/3AkkRP5AU8QNJET+QFPEDSRE/kBTxA0kRP5AU8QNJET+QFPEDSRE/\\nkBTxA0kRP5AU8QNJET+QFPEDSRE/kBTxA0kRP5AU8QNJET+QFPEDSRE/kBTxA0kRP5AU8QNJET+Q\\nFPEDSfW1Sq/tnZLelTQh6WBEtKoYCkD9+oq/8O2IeLuC6wEwQDztB5LqN/6Q9KjtZ22vrmIgAIPR\\n79P+cyNit+0vSNps+x8R8fjUMxT/KayWpJPnV/EqA0AV+nrkj4jdxfdxSQ9LWjzNedZGRCsiWnNG\\nR/q5OQAV6jl+2zNtH3/otKQLJW2rajAA9ernefhcSQ/bPnQ9f4yIv1YyFYDa9Rx/RLwm6cwKZwEw\\nQBzqA5IifiAp4geSIn4gKeIHkiJ+ICniB5IifiAp4geSIn4gKeIHki
J+ICniB5IifiAp/q7WEeDM\\nW69uu+/E3zxZetlNb26tehwcJnjkB5IifiAp4geSIn4gKeIHkiJ+ICniB5IifiAp4geSIn4gKeIH\\nkiJ+ICniB5IifiAp4geS6hi/7XW2x21vm7Jttu3NtncU32fVOyZKueQLaKObR/57JS39xLabJG2J\\niNMkbSl+BnAY6Rh/RDwu6cAnNi+XtL44vV7SiornAlCzXl/zz42IPcXptyTNrWgeAAPS9xt+ERGS\\not1+26ttj9ke27d/ot+bA1CRXuPfa3ueJBXfx9udMSLWRkQrIlpzRkd6vDkAVes1/g2SVhWnV0l6\\npJpxAAxKN4f6HpD0lKSv2N5l+3JJayRdYHuHpO8UPwM4jHT8u/0RsbLNrvMrngU9ev5Hv2+7b8kd\\niwY4CQ4nfMIPSIr4gaSIH0iK+IGkiB9IiviBpFiiG0Pr7Oe/V7r/nf8cW7p/+9n3VznOEYdHfiAp\\n4geSIn4gKeIHkiJ+ICniB5IifiApjvOjL62fXVW6f/Tup3q+7s/r1Q77yy1R+19n3vTm1h4mOrLw\\nyA8kRfxAUsQPJEX8QFLEDyRF/EBSxA8kxXF+lFp6cqt0/+jB8uP4+684u+2+n/74vtLLrpj5Xun+\\nTpZ8sf1x/mUXfL/0shs3/6mv2z4c8MgPJEX8QFLEDyRF/EBSxA8kRfxAUsQPJNXxOL/tdZK+K2k8\\nIs4ott0i6QpJ+4qz3RwRG+saEvUpOxY+6WBf1z/2izv7ujzq080j/72Slk6z/Y6IWFR8ET5wmOkY\\nf0Q8LunAAGYBMED9vOa/1vYLttfZnlXZRAAGotf475R0qqRFkvZIuq3dGW2vtj1me2zf/okebw5A\\n1XqKPyL2RsRERHwo6S5Ji0vOuzYiWhHRmjM60uucACrWU/y250358WJJ26oZB8CgdHOo7wFJ50k6\\nwfYuST+XdJ7tRZJC0k5JV9Y4I4AadIw/IlZOs/meGmZBj8781dVt952oJ/u67k5/377z5wTqc+Wu\\n9n8rYNJ/2+7J8Pv6nfAJPyAp4geSIn4gKeIHkiJ+ICniB5LiT3cfAU68o/fDeXUvVV12KPDUZ44t\\nveyrX/tfh2tvfyhPkt764TdL9rJEN4/8QFLEDyRF/EBSxA8kRfxAUsQPJEX8QFIc5z/ClR/rlvo9\\n3v3Lfz9duv/GU77edl/n4/jlrtrxSun+FTM5ll+GR34gKeIHkiJ+ICniB5IifiAp4geSIn4gKY7z\\nHwHKfye/3mPdi445pnR/3X8vAL3jkR9IiviBpIgfSIr4gaSIH0iK+IGkiB9IqmP8thfYfsz2S7a3\\n276u2D7b9mbbO4rvs+ofF0BVunnkPyjphog4XdI3JF1j+3RJN0naEhGnSdpS/AzgMNEx/ojYExHP\\nFafflfSypPmSlktaX5xtvaQVdQ0JoHqf6TW/7YWSzpL0tKS5EbGn2PWWpLmVTgagVl3Hb/s4SQ9J\\nuj4i3pm6LyJCUrS53GrbY7bH9u2f6GtYANXpKn7bMzQZ/v0R8Zdi817b84r98ySNT3fZiFgbEa2I\\naM0ZHaliZgAV6Obdfku6R9LLEXH7lF0bJK0qTq+S9Ej14wGoSze/0nuOpMskvWj70O9n3ixpjaQ/\\n275c0uuSLqlnRAB16Bh/RDwhyW12n1/tOAAGhU/4AUkRP5AU8QNJET+QFPEDSRE/kBTxA0kRP5AU\\n8QNJET+QFPEDSRE/kBTxA0kRP5AU8QNJET+QFPEDSRE/kBTxA0kRP5AU8QNJET+QFPEDSRE/kBTx\\nA0kRP5AU8QNJET+QFPEDSRE/kFTH+G0vsP2Y7Zdsb7d9XbH9Ftu7bW8tvpbVPy6AqhzVxXkOSroh\\nIp6zfbykZ21vLvbdERG/rm88AHXpGH9E7JG0pzj9ru2XJc2vezAA9fpMr/ltL5R0lqSni03X2n7B\\n
9jrbs9pcZrXtMdtj+/ZP9DUsgOp0Hb/t4yQ9JOn6iHhH0p2STpW0SJPPDG6b7nIRsTYiWhHRmjM6\\nUsHIAKrQVfy2Z2gy/Psj4i+SFBF7I2IiIj6UdJekxfWNCaBq3bzbb0n3SHo5Im6fsn3elLNdLGlb\\n9eMBqEs37/afI+kySS/a3lpsu1nSStuLJIWknZKurGVCALXo5t3+JyR5ml0bqx8HwKDwCT8gKeIH\\nkiJ+ICniB5IifiAp4geSIn4gKeIHkiJ+ICniB5IifiAp4geSIn4gKeIHknJEDO7G7H2SXp+y6QRJ\\nbw9sgM9mWGcb1rkkZutVlbN9KSLmdHPGgcb/qRu3xyKi1dgAJYZ1tmGdS2K2XjU1G0/7gaSIH0iq\\n6fjXNnz7ZYZ1tmGdS2K2XjUyW6Ov+QE0p+lHfgANaSR+20tt/9P2K7ZvamKGdmzvtP1isfLwWMOz\\nrLM9bnvblG2zbW+2vaP4Pu0yaQ3NNhQrN5esLN3ofTdsK14P/Gm/7RFJ/5J0gaRdkp6RtDIiXhro\\nIG3Y3impFRGNHxO2/S1J70n6Q0ScUWy7VdKBiFhT/Mc5KyJuHJLZbpH0XtMrNxcLysyburK0pBWS\\nfqAG77uSuS5RA/dbE4/8iyW9EhGvRcT7kh6UtLyBOYZeRDwu6cAnNi+XtL44vV6T/3gGrs1sQyEi\\n9kTEc8XpdyUdWlm60fuuZK5GNBH/fElvTPl5l4Zrye+Q9KjtZ22vbnqYacwtlk2XpLckzW1ymGl0\\nXLl5kD6xsvTQ3He9rHhdNd7w+7RzI+Krki6SdE3x9HYoxeRrtmE6XNPVys2DMs3K0h9p8r7rdcXr\\nqjUR/25JC6b8fFKxbShExO7i+7ikhzV8qw/vPbRIavF9vOF5PjJMKzdPt7K0huC+G6YVr5uI/xlJ\\np9k+xfbRki6VtKGBOT7F9szijRjZninpQg3f6sMbJK0qTq+S9EiDs3zMsKzc3G5laTV83w3ditcR\\nMfAvScs0+Y7/q5J+0sQMbeb6sqTni6/tTc8m6QFNPg38QJPvjVwuaVTSFkk7JP1N0uwhmu0+SS9K\\nekGToc1raLZzNfmU/gVJW4uvZU3fdyVzNXK/8Qk/ICne8AOSIn4gKeIHkiJ+ICniB5IifiAp4geS\\nIn4gqf8DKi6h9fAP6nwAAAAASUVORK5CYII=\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f62243551d0>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/png\": 
\"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAADJpJREFUeJzt3X+sX3V9x/Hne9fiEvAPWrq2lro6\\n1i1pSFq3m2ZTsrE4LaJJMVuITWa6rFmNSjaMwRE2N+L2B8Ep+sckqdJYF4eYIKORZpU1WwiBCBdC\\n+SFzZaQOan/Rmgj7B6jv/XFPzZXee76X76/zbd/PR/LNPd/zOed73vnCq+d8z+ec84nMRFI9v9R1\\nAZK6Yfilogy/VJThl4oy/FJRhl8qyvBLRRl+qSjDLxX1lnFu7JKlU7l2zZJxblIq5dALr/HSqdOx\\nmGUHCn9EXAV8GZgCvpaZt7Qtv3bNEh7Zt2aQTUpqsWnzC4tetu/D/oiYAv4J+ACwHtgaEev7/TxJ\\n4zXIb/5NwHOZ+Xxmvgp8C9gynLIkjdog4V8NzD3GeLGZ9wsiYkdEzETEzImTpwfYnKRhGvnZ/szc\\nmZnTmTm9fNnUqDcnaZEGCf9hYO7Zu0ubeZLOAYOE/1FgXUS8MyIuAD4C7BlOWZJGre+uvsx8PSKu\\nA/Yx29W3KzOfGVplkkZqoH7+zNwL7B1SLZLGyMt7paIMv1SU4ZeKMvxSUYZfKsrwS0UZfqkowy8V\\nZfilogy/VJThl4oy/FJRhl8qaqyP7tb4/f6OHa3tv/zdR8ZUydmOfurdre0HbvjKmCqpyT2/VJTh\\nl4oy/FJRhl8qyvBLRRl+qSjDLxVlP/95YMOtn1iwbeV3H2pd9+j17X3to7TytvbaNt+2caDP3/fj\\nJwZa/3znnl8qyvBLRRl+qSjDLxVl+KWiDL9UlOGXihqonz8iDgEvA6eB1zNzehhFaXwOfKbDe+Y/\\nM9jqm98+2HUA1Q3jIp8/yMyXhvA5ksbIw36pqEHDn8D3IuKxiGh/XpSkiTLoYf8VmXk4In4FuD8i\\n/iszH5i7QPOPwg6Ad6z2VgJpUgy058/Mw83f48A9wKZ5ltmZmdOZOb182dQgm5M0RH2HPyIujIi3\\nnZkG3g88PazCJI3WIMfhK4B7IuLM5/xLZv7bUKqSNHJ9hz8znwc2DLEW9amtr37zl87fvnDv1x+M\\nXX1SUYZfKsrwS0UZfqkowy8VZfilorze9jzQ+uhu2h+P3aVBb8m1q28w7vmlogy/VJThl4oy/FJR\\nhl8qyvBLRRl+qSj7+c8Dg9zS+8FNH2xtv++R+/qqaTEGHx7cfv5BuOeXijL8UlGGXyrK8EtFGX6p\\nKMMvFWX4paLs5z/P9brnffPb29fvdc/9IPfUj3p48H/9v4sWbLvmwldGuu1zgXt+qSjDLxVl+KWi\\nDL9UlOGXijL8UlGGXyqqZz9/ROwCPgQcz8zLm3lLgbuAtcAh4NrM/MnoytSofPzgc63tt6/79db2\\nLp8H0Etb7df4zP9F7fm/Dlz1hnk3Avszcx2wv3kv6RzSM/yZ+QBw6g2ztwC7m+ndwDVDrkvSiPX7\\nm39FZh5ppo8CK4ZUj6QxGfiEX2YmkAu1R8SOiJiJiJkTJ08PujlJQ9Jv+I9FxCqA5u/xhRbMzJ2Z\\nOZ2Z08uXTfW5OUnD1m/49wDbmultwL3DKUfSuPQMf0TcCTwM/GZEvBgR24FbgPdFxEHgD5v3ks4h\\nPfv5M3PrAk3vHXIt6kCv+9p79Yf3eh7AKG249ROt7St5aEyVnJu8wk8qyvBLRRl+qSjDLxVl+KWi\\nDL9UlI/u1kht+PzC3XEHbhjs0d0XH3xtoPWrc88vFWX4paIMv1SU4ZeKMvxSUYZfKsrwS0XZz6+B\\nHPr7321tX/vZlttqbxhs22+
979HW9kP/0Fabj+52zy8VZfilogy/VJThl4oy/FJRhl8qyvBLRdnP\\nr4H8cPvtre2bP7txTJWcbe3fPLxw45+Nr45J5Z5fKsrwS0UZfqkowy8VZfilogy/VJThl4rq2c8f\\nEbuADwHHM/PyZt7NwJ8DJ5rFbsrMvaMqUpOry2Gy9/UcPry7awzOBYvZ838duGqe+bdl5sbmZfCl\\nc0zP8GfmA8CpMdQiaYwG+c1/XUQ8GRG7IuLioVUkaSz6Df/twGXARuAI8IWFFoyIHRExExEzJ06e\\n7nNzkoatr/Bn5rHMPJ2ZPwO+CmxqWXZnZk5n5vTyZVP91ilpyPoKf0SsmvP2w8DTwylH0rgspqvv\\nTuBK4JKIeBH4O+DKiNgIJHAI+NgIa5Q0Aj3Dn5lb55l9xwhqkTRGXuEnFWX4paIMv1SU4ZeKMvxS\\nUYZfKspHd2sgK7/Ufsvu0U+9u6W1u2Gyr77yj1rb9/7n3WOqpDvu+aWiDL9UlOGXijL8UlGGXyrK\\n8EtFGX6pKPv5NVIHbvhKZ9u+438fXLBt+zvGWMiEcs8vFWX4paIMv1SU4ZeKMvxSUYZfKsrwS0XZ\\nz19czyG2e9yvP8kufctFfa87/bcfb22f+dztfX/2pHDPLxVl+KWiDL9UlOGXijL8UlGGXyrK8EtF\\n9eznj4g1wDeAFUACOzPzyxGxFLgLWAscAq7NzJ+MrtTz1/qH/qS1fc0fPz2yba9ksH78fT/u7tn7\\no7Tsaw+3L/C58dQxSovZ878OfDoz1wO/A3wyItYDNwL7M3MdsL95L+kc0TP8mXkkMx9vpl8GngVW\\nA1uA3c1iu4FrRlWkpOF7U7/5I2It8C7g+8CKzDzSNB1l9meBpHPEosMfERcBdwPXZ+ZP57ZlZjJ7\\nPmC+9XZExExEzJw4eXqgYiUNz6LCHxFLmA3+NzPzO83sYxGxqmlfBRyfb93M3JmZ05k5vXzZ1DBq\\nljQEPcMfEQHcATybmV+c07QH2NZMbwPuHX55kkZlMbf0vgf4KPBURJzp17kJuAX4dkRsB34EXDua\\nEiffhs/3uC32tvbutDW0d+VN/cZl7QUsWfg/497772pft6jztYvyzegZ/sx8EIgFmt873HIkjYtX\\n+ElFGX6pKMMvFWX4paIMv1SU4ZeK8tHdQ9BrGOrH/uLV1vbffusFPbZgn7SGzz2/VJThl4oy/FJR\\nhl8qyvBLRRl+qSjDLxVlP/8Y9O7Hl8bPPb9UlOGXijL8UlGGXyrK8EtFGX6pKMMvFWX4paIMv1SU\\n4ZeKMvxSUYZfKsrwS0UZfqkowy8V1TP8EbEmIv4jIn4QEc9ExF8282+OiMMR8UTzunr05UoalsU8\\nzON14NOZ+XhEvA14LCLub9puy8x/HF15kkalZ/gz8whwpJl+OSKeBVaPujBJo/WmfvNHxFrgXcD3\\nm1nXRcSTEbErIi5eYJ0dETETETMnTp4eqFhJw7Po8EfERcDdwPWZ+VPgduAyYCOzRwZfmG+9zNyZ\\nmdOZOb182dQQSpY0DIsKf0QsYTb438zM7wBk5rHMPJ2ZPwO+CmwaXZmShm0xZ/sDuAN4NjO/OGf+\\nqjmLfRh4evjlSRqVxZztfw/wUeCpiDgzVvRNwNaI2AgkcAj42EgqlDQSiznb/yAQ8zTtHX45ksbF\\nK/ykogy/VJThl4oy/FJRhl8qyvBLRRl+qSjDLxVl+KWiDL9UlOGXijL8UlGGXyrK8EtFRWaOb2MR\\nJ4AfzZl1CfDS2Ap4cya1tkmtC6ytX8Os7Vczc/liFhxr+M/aeMRMZk53VkCLSa1tUusCa+tXV7V5\\n2C8VZfiloroO/86Ot99mUmub1LrA2vrVSW2d/uaX1J2u9/ySOtJJ+CPiqoj4YUQ8FxE3dlHDQiLi\\nUEQ81Yw8PNNxLbsi4nhEPD1n3tKIuD8iDjZ/5x0mraPaJmLk
5paRpTv97iZtxOuxH/ZHxBTw38D7\\ngBeBR4GtmfmDsRaygIg4BExnZud9whHxe8ArwDcy8/Jm3q3Aqcy8pfmH8+LM/KsJqe1m4JWuR25u\\nBpRZNXdkaeAa4E/p8LtrqetaOvjeutjzbwKey8znM/NV4FvAlg7qmHiZ+QBw6g2ztwC7m+ndzP7P\\nM3YL1DYRMvNIZj7eTL8MnBlZutPvrqWuTnQR/tXAC3Pev8hkDfmdwPci4rGI2NF1MfNY0QybDnAU\\nWNFlMfPoOXLzOL1hZOmJ+e76GfF62Dzhd7YrMvO3gA8An2wObydSzv5mm6TumkWN3Dwu84ws/XNd\\nfnf9jng9bF2E/zCwZs77S5t5EyEzDzd/jwP3MHmjDx87M0hq8/d4x/X83CSN3DzfyNJMwHc3SSNe\\ndxH+R4F1EfHOiLgA+Aiwp4M6zhIRFzYnYoiIC4H3M3mjD+8BtjXT24B7O6zlF0zKyM0LjSxNx9/d\\nxI14nZljfwFXM3vG/3+Av+6ihgXq+jXgQPN6puvagDuZPQx8jdlzI9uBZcB+4CDw78DSCartn4Gn\\ngCeZDdqqjmq7gtlD+ieBJ5rX1V1/dy11dfK9eYWfVJQn/KSiDL9UlOGXijL8UlGGXyrK8EtFGX6p\\nKMMvFfX/gxbi67QgFM4AAAAASUVORK5CYII=\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f6224379ba8>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/png\": \"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAC8ZJREFUeJzt3V2MXHd5gPHn7RK4CAjZjnGNcXFA\\nKZKVqqZa3KqJKhBfIYrkcJNiJOpKUY2ASKRCgihckKsqqiBpLwrSurEwFQ1FIlF8YRGCRZtSUJpN\\nZJwvSEJqFDuO7diRSIQQZHl7scdonezOjGfOzBnv+/yk1c6eM7vzZpTHZ2bO7P4jM5FUzx90PYCk\\nbhi/VJTxS0UZv1SU8UtFGb9UlPFLRRm/VJTxS0W9bpI3dsnamdyy+aJJ3qRUypFnf8sLZxZikOuO\\nFH9EXAX8MzAD/Gtm3trr+ls2X8T/3rt5lJuU1MP2Dz878HWHftgfETPAvwAfAbYCOyNi67A/T9Jk\\njfKcfzvwdGY+k5m/Ab4F7GhnLEnjNkr8m4CljzGONtvOERG7I2I+IuZPnV4Y4eYktWnsr/Zn5lxm\\nzmbm7Pp1M+O+OUkDGiX+Y8DSV+/e1myTdAEYJf4Hgcsi4tKIeD3wMWB/O2NJGrehT/Vl5isRcQNw\\nL4un+vZm5mOtTSZprEY6z5+ZB4ADLc0iaYJ8e69UlPFLRRm/VJTxS0UZv1SU8UtFGb9UlPFLRRm/\\nVJTxS0UZv1SU8UtFGb9UlPFLRRm/VJTxS0UZv1SU8UtFGb9UlPFLRRm/VJTxS0UZv1SU8UtFGb9U\\nlPFLRRm/VJTxS0UZv1TUSKv0RsQR4CVgAXglM2fbGErS+I0Uf+N9mflCCz9H0gT5sF8qatT4E/he\\nRDwUEbvbGEjSZIz6sP/KzDwWEW8B7ouIn2bm/Uuv0PyjsBvgjza18SxDUhtGOvJn5rHm80ngbmD7\\nMteZy8zZzJxdv25mlJuT1KKh44+IiyPiTWcvAx8CHm1rME
njNcrj8A3A3RFx9uf8e2Z+t5WpJI3d\\n0PFn5jPAn7Y4i6QJ8lSfVJTxS0UZv1SU8UtFGb9UlPFLRfl+21XgXf/9Nyvu2/LXhyc4yXS597lD\\nY/vZH37rtp77n7/xL3vu/8nnv9rmOEPxyC8VZfxSUcYvFWX8UlHGLxVl/FJRxi8V5Xn+VaDXufwL\\n4XzzsPqda/+T2z694r63fvlHbY9zjj/8pz4///NjvfmBeOSXijJ+qSjjl4oyfqko45eKMn6pKOOX\\nivI8/yp3IZ/HH1Wvc/m/vuY1i0ud47/m5nru7/v7/H/f+/0VML6/NTAoj/xSUcYvFWX8UlHGLxVl\\n/FJRxi8VZfxSUX3P80fEXuAa4GRmXt5sWwv8B7AFOAJcl5kvjm9MVXT1B67rc40ne+7t/Xf7uz/P\\n3rVBjvxfB6561babgIOZeRlwsPla0gWkb/yZeT9w5lWbdwD7msv7gGtbnkvSmA37nH9DZh5vLj8P\\nbGhpHkkTMvILfpmZQK60PyJ2R8R8RMyfOr0w6s1Jasmw8Z+IiI0AzeeTK10xM+cyczYzZ9evmxny\\n5iS1bdj49wO7msu7gHvaGUfSpPSNPyLuBH4MvCsijkbE9cCtwAcj4ingA83Xki4gfc/zZ+bOFXa9\\nv+VZpHMc+P63e+7v9zv1oxj1Z1/x8YdbmmR8fIefVJTxS0UZv1SU8UtFGb9UlPFLRfmnu1e5cZ4O\\nm3a9/ttf/u47en7v/zx3V9vjTB2P/FJRxi8VZfxSUcYvFWX8UlHGLxVl/FJRnudf5Wa2/nHP/b+6\\n9M099//nnj1tjtOqfu9h8E939+aRXyrK+KWijF8qyvilooxfKsr4paKMXyrK8/yrXL8/f626PPJL\\nRRm/VJTxS0UZv1SU8UtFGb9UlPFLRfU9zx8Re4FrgJOZeXmz7Rbg74BTzdVuzswD4xpytZv90qd6\\n7l+358cTmkSVDHLk/zpw1TLbb8/Mbc2H4UsXmL7xZ+b9wJkJzCJpgkZ5zn9DRByOiL0Rsaa1iSRN\\nxLDxfw14J7ANOA58ZaUrRsTuiJiPiPlTpxeGvDlJbRsq/sw8kZkLmfk7YA+wvcd15zJzNjNn16+b\\nGXZOSS0bKv6I2Ljky48Cj7YzjqRJGeRU353Ae4FLIuIo8CXgvRGxDUjgCPDJMc4oaQz6xp+ZO5fZ\\nfMcYZlm1+v19+bes+WnP/f/wfw/03L/tDW8475kk3+EnFWX8UlHGLxVl/FJRxi8VZfxSUf7p7imw\\n8OKLPfd/4dI/n9AkqsQjv1SU8UtFGb9UlPFLRRm/VJTxS0UZv1SU5/k1kifn3tNz/+3vu3NCk+h8\\neeSXijJ+qSjjl4oyfqko45eKMn6pKOOXivI8/wTc+9yhrkcYo9X837a6eeSXijJ+qSjjl4oyfqko\\n45eKMn6pKOOXiuobf0RsjogfRMTjEfFYRHy22b42Iu6LiKeaz2vGP66ktgxy5H8F+FxmbgX+AvhM\\nRGwFbgIOZuZlwMHma0kXiL7xZ+bxzHy4ufwS8ASwCdgB7Guutg+4dlxDSmrfeT3nj4gtwLuBB4AN\\nmXm82fU8sKHVySSN1cDxR8Qbge8AN2bmL5fuy8wEcoXv2x0R8xExf+r0wkjDSmrPQPFHxEUshv/N\\nzLyr2XwiIjY2+zcCJ5f73sycy8zZzJxdv26mjZkltWCQV/sDuAN4IjNvW7JrP7CrubwLuKf98SSN\\nyyC/0nsF8AngkYg4+/ubNwO3At+OiOuBXwDXjWdESePQN/7M/CEQK+x+f7vjSJoU3+EnFWX8UlHG\\nLxVl/FJRxi8VZfxSUcYvFWX8UlHGLxVl/FJRxi8VZfxSUcYvFWX8UlHGLxVl/FJRxi8VZfxSUcYv\\nFWX8UlHGLxVl/FJRxi8VZfxSUcYvFWX8UlHGLxVl/FJRxi8VZfxSUX3jj4jNEfGDiHg8Ih6
LiM82\\n22+JiGMRcaj5uHr840pqy+sGuM4rwOcy8+GIeBPwUETc1+y7PTO/PL7xJI1L3/gz8zhwvLn8UkQ8\\nAWwa92CSxuu8nvNHxBbg3cADzaYbIuJwROyNiDUrfM/uiJiPiPlTpxdGGlZSewaOPyLeCHwHuDEz\\nfwl8DXgnsI3FRwZfWe77MnMuM2czc3b9upkWRpbUhoHij4iLWAz/m5l5F0BmnsjMhcz8HbAH2D6+\\nMSW1bZBX+wO4A3giM29bsn3jkqt9FHi0/fEkjcsgr/ZfAXwCeCQiDjXbbgZ2RsQ2IIEjwCfHMqGk\\nsRjk1f4fArHMrgPtjyNpUnyHn1SU8UtFGb9UlPFLRRm/VJTxS0UZv1SU8UtFGb9UlPFLRRm/VJTx\\nS0UZv1SU8UtFRWZO7sYiTgG/WLLpEuCFiQ1wfqZ1tmmdC5xtWG3O9vbMXD/IFSca/2tuPGI+M2c7\\nG6CHaZ1tWucCZxtWV7P5sF8qyvilorqOf67j2+9lWmeb1rnA2YbVyWydPueX1J2uj/ySOtJJ/BFx\\nVUT8LCKejoibuphhJRFxJCIeaVYenu94lr0RcTIiHl2ybW1E3BcRTzWfl10mraPZpmLl5h4rS3d6\\n303bitcTf9gfETPAk8AHgaPAg8DOzHx8ooOsICKOALOZ2fk54Yj4K+Bl4BuZeXmz7R+BM5l5a/MP\\n55rM/MKUzHYL8HLXKzc3C8psXLqyNHAt8Ld0eN/1mOs6OrjfujjybweezsxnMvM3wLeAHR3MMfUy\\n837gzKs27wD2NZf3sfg/z8StMNtUyMzjmflwc/kl4OzK0p3edz3m6kQX8W8Cnl3y9VGma8nvBL4X\\nEQ9FxO6uh1nGhmbZdIDngQ1dDrOMvis3T9KrVpaemvtumBWv2+YLfq91ZWb+GfAR4DPNw9uplIvP\\n2abpdM1AKzdPyjIrS/9el/fdsCtet62L+I8Bm5d8/bZm21TIzGPN55PA3Uzf6sMnzi6S2nw+2fE8\\nvzdNKzcvt7I0U3DfTdOK113E/yBwWURcGhGvBz4G7O9gjteIiIubF2KIiIuBDzF9qw/vB3Y1l3cB\\n93Q4yzmmZeXmlVaWpuP7bupWvM7MiX8AV7P4iv/PgS92McMKc70D+Enz8VjXswF3svgw8LcsvjZy\\nPbAOOAg8BXwfWDtFs/0b8AhwmMXQNnY025UsPqQ/DBxqPq7u+r7rMVcn95vv8JOK8gU/qSjjl4oy\\nfqko45eKMn6pKOOXijJ+qSjjl4r6f2vcpxkDpnnOAAAAAElFTkSuQmCC\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f6224322550>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/png\": 
\"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAC8BJREFUeJzt3V2oHPUdxvHn6WmkEL1IYhpijMba\\nIMSWxnIMtoZisWpMC4kUxFAkhdBIVagitGIvKvQmtFXxogaSGozFt4KKoaSNaWgJUlGPEvNiahMl\\n1sS8mRRMLqzm+OvFmcgxOfvi7uzOnPy+H1h2dv6zOz+G85x5+c/u3xEhAPl8oeoCAFSD8ANJEX4g\\nKcIPJEX4gaQIP5AU4QeSIvxAUoQfSOqL/VzZuZMHYtbMCf1cJZDKnnc/1vtHh93Osl2F3/YCSQ9K\\nGpD0h4hY0Wz5WTMn6OUNM7tZJYAm5l33btvLdnzYb3tA0u8lXS9pjqQltud0+nkA+qubc/55knZH\\nxNsR8ZGkJyUtKqcsAL3WTfhnSBp9jLG3mPcZtpfbHrI9dPjIcBerA1Cmnl/tj4hVETEYEYNTpwz0\\nenUA2tRN+PdJGn317vxiHoBxoJvwvyJptu2LbJ8l6SZJ68opC0CvddzVFxEnbN8uaYNGuvrWRMSO\\n0ioD0FNd9fNHxHpJ60uqBUAfcXsvkBThB5Ii/EBShB9IivADSRF+ICnCDyRF+IGkCD+QFOEHkiL8\\nQFKEH0iK8ANJEX4gKcIPJEX4gaQIP5AU4QeSIvxAUoQfSIrwA0n1dYjuM9Wt+65o2v7W5R929fkb\\n3tvS1fuBsbDnB5Ii/EBShB9IivADSRF+ICnCDyRF+IGkuurnt71H0jFJw5JORMRgGUWNN6368ff8\\n+ltN299ctrLMcoC2lHGTz3cj4v0SPgdAH3HYDyTVbfhD0vO2X7W9vIyCAPRHt4f98yNin+0vS9po\\n+18RsXn0AsU/heWSdMEMvkoA1EVXe/6I2Fc8H5L0rKR5YyyzKiIGI2Jw6pSBblYHoEQdh9/2RNvn\\nnJyWdK2k7WUVBqC3ujkOnybpWdsnP+fxiPhrKVUB6LmOwx8Rb0v6Rom11NqCC087oxnlo6bvpR8f\\ndURXH5AU4QeSIvxAUoQfSIrwA0kRfiAp7rdtU3zcuDuPn9bGeMSeH0iK8ANJEX4gKcIPJEX4gaQI\\nP5AU4QeSop8flbnuvLlN23+6a3fT9sUTj5dZTjrs+YGkCD+QFOEHkiL8QFKEH0iK8ANJEX4gKcIP\\nJEX4gaQIP5AU4QeSIvxAUoQfSIrwA0kRfiCpluG3vcb2IdvbR82bbHuj7V3F86TelgmgbO3s+R+R\\ntOCUeXdL2hQRsyVtKl4DGEdahj8iNks6esrsRZLWFtNrJS0uuS4APdbpOf+0iNhfTB+QNK2kegD0\\nSdcX/CIiJEWjdtvLbQ/ZHjp8ZLjb1QEoSafhP2h7uiQVz4caLRgRqyJiMCIGp04Z6HB1AMrWafjX\\nSVpaTC+V9Fw55QDol3a6+p6Q9KKkS2zvtb1M0gpJ19jeJel7xWsA40jL3+2PiCUNmq4uuZZxq9Xv\\nz1fpC3PnNG3/y/rH+1QJ6oY7/ICkCD+QFOEHkiL8QFKEH0iK8ANJMUR3mza8t6Xj91764o+atp//\\nwx0df7YkHbjj2w3bXv/5Q119Ns5c7PmBpAg/kBThB5Ii/EBShB9IivADSRF+ICn6+fugVT/+wKWX\\nNG1fv/GpFmvo/B4E5MWeH0iK8ANJEX4gKcIPJEX4gaQIP5AU4QeSop+/Blr34wPlY88PJEX4gaQI\\nP5AU4QeSIvxAUoQfSIrwA0m1DL/tNbYP2d4+at69tvfZ3lI8Fva2TABla2fP/4ikBWPMfyAi5haP\\n9eWWBaDXWoY/IjZLOtqHWgD
0UTfn/Lfb3lqcFkwqrSIAfdFp+FdKuljSXEn7Jd3XaEHby20P2R46\\nfGS4w9UBKFtH4Y+IgxExHBGfSFotaV6TZVdFxGBEDE6dMtBpnQBK1lH4bU8f9fIGSdsbLQugnlp+\\npdf2E5KuknSu7b2SfiXpKttzJYWkPZJu6WGNAHqgZfgjYskYsx/uQS0A+og7/ICkCD+QFOEHkiL8\\nQFKEH0iK8ANJ8dPd6Kn/nDhedQlogD0/kBThB5Ii/EBShB9IivADSRF+ICnCDyRFP38NXHfe3MrW\\n/b/vX960/R+rV3f1+T+5YH5X70fvsOcHkiL8QFKEH0iK8ANJEX4gKcIPJEX4gaTo5++DDe9tqWzd\\nX3/g1qbt5/32n03bu70H4cMfNBzMSV/688tN37ty9lebti+ucLueCdjzA0kRfiApwg8kRfiBpAg/\\nkBThB5Ii/EBSLfv5bc+U9KikaZJC0qqIeND2ZElPSZolaY+kGyPiv70rFZ3YdudDzRe4s9cVNO6L\\nr/J3DNDenv+EpLsiYo6kKyTdZnuOpLslbYqI2ZI2Fa8BjBMtwx8R+yPitWL6mKSdkmZIWiRpbbHY\\nWkmLe1UkgPJ9rnN+27MkXSbpJUnTImJ/0XRAI6cFAMaJtsNv+2xJT0u6IyI+GN0WEaGR6wFjvW+5\\n7SHbQ4ePDHdVLIDytBV+2xM0EvzHIuKZYvZB29OL9umSDo313ohYFRGDETE4dcpAGTUDKEHL8Nu2\\npIcl7YyI+0c1rZO0tJheKum58ssD0CvtfKX3Skk3S9pm+2S/zT2SVkj6k+1lkt6RdGNvSsSZqtVX\\nnVt1BXbTVVjl16zromX4I+IFSW7QfHW55QDoF+7wA5Ii/EBShB9IivADSRF+ICnCDyTFT3ejtrq9\\nDwDNsecHkiL8QFKEH0iK8ANJEX4gKcIPJEX4gaTo58e4xXfyu8OeH0iK8ANJEX4gKcIPJEX4gaQI\\nP5AU4QeSIvxAUoQfSIrwA0kRfiApwg8kRfiBpAg/kBThB5JqGX7bM23/3fYbtnfY/lkx/17b+2xv\\nKR4Le18ugLK082MeJyTdFRGv2T5H0qu2NxZtD0TE73pXHoBeaRn+iNgvaX8xfcz2Tkkzel0YgN76\\nXOf8tmdJukzSS8Ws221vtb3G9qQG71lue8j20OEjw10VC6A8bYff9tmSnpZ0R0R8IGmlpIslzdXI\\nkcF9Y70vIlZFxGBEDE6dMlBCyQDK0Fb4bU/QSPAfi4hnJCkiDkbEcER8Imm1pHm9KxNA2dq52m9J\\nD0vaGRH3j5o/fdRiN0jaXn55AHqlnav9V0q6WdI22yd/K/keSUtsz5UUkvZIuqUnFQLoiXau9r8g\\nyWM0rS+/HAD9wh1+QFKEH0iK8ANJEX4gKcIPJEX4gaQIP5AU4QeSIvxAUoQfSIrwA0kRfiApwg8k\\nRfiBpBwR/VuZfVjSO6NmnSvp/b4V8PnUtba61iVRW6fKrO3CiJjazoJ9Df9pK7eHImKwsgKaqGtt\\nda1LorZOVVUbh/1AUoQfSKrq8K+qeP3N1LW2utYlUVunKqmt0nN+ANWpes8PoCKVhN/2Attv2t5t\\n++4qamjE9h7b24qRh4cqrmWN7UO2t4+aN9n2Rtu7iucxh0mrqLZajNzcZGTpSrdd3Ua87vthv+0B\\nSf+WdI2kvZJekbQkIt7oayEN2N4jaTAiKu8Ttv0dScclPRoRXyvm/UbS0YhYUfzjnBQRv6hJbfdK\\nOl71yM3FgDLTR48sLWmxpB+rwm3XpK4bVcF2q2LPP0/S7oh4OyI+kvSkpEUV1FF7EbFZ0tFTZi+S\\ntLaYXquRP56+a1BbLUTE/oh4rZg+JunkyNKVbrsmdVWiivDPkPTuqNd7Va8hv0PS87Zftb286mLG\\nMK0YNl2SDkiaVmUxY2g5cnM/nTKydG22XScjXpeNC36nmx8R
35R0vaTbisPbWoqRc7Y6dde0NXJz\\nv4wxsvSnqtx2nY54XbYqwr9P0sxRr88v5tVCROwrng9Jelb1G3344MlBUovnQxXX86k6jdw81sjS\\nqsG2q9OI11WE/xVJs21fZPssSTdJWldBHaexPbG4ECPbEyVdq/qNPrxO0tJieqmk5yqs5TPqMnJz\\no5GlVfG2q92I1xHR94ekhRq54v+WpF9WUUODur4i6fXisaPq2iQ9oZHDwI81cm1kmaQpkjZJ2iXp\\nb5Im16i2P0raJmmrRoI2vaLa5mvkkH6rpC3FY2HV265JXZVsN+7wA5Ligh+QFOEHkiL8QFKEH0iK\\n8ANJEX4gKcIPJEX4gaT+D8kFqXGPpjBrAAAAAElFTkSuQmCC\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f62242c9048>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/png\": \"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAADG9JREFUeJzt3X+o3XUdx/HXq+si0gI311pzNVtT\\nmlYzLiNzhJHVHMKMQlxmi4ZXSkHDPxIjMuiPEWb0Rw1uOVplWqDmiJHNYYootqus/cxt2sStuc0t\\ncAZRu777434Xt+2e7zk753vO99z7fj7gcL7n+/me+337xde+Pz7f8/04IgQgn7fUXQCAehB+ICnC\\nDyRF+IGkCD+QFOEHkiL8QFKEH0iK8ANJndXLlZ03fSDmzZ3Wy1UCqex75T967dioW1m2o/DbXirp\\nR5IGJP0sIlaXLT9v7jT9+dG5nawSQInFn32l5WXbPuy3PSDpx5KukrRQ0grbC9v9ewB6q5Nz/sWS\\n9kbESxHxb0kPSFpeTVkAuq2T8M+RNP4YY38x7//YHrI9YnvkyNHRDlYHoEpdv9ofEcMRMRgRgzNn\\nDHR7dQBa1En4D0gaf/Xu/GIegEmgk/BvlrTA9gW23yrpOknrqykLQLe13dUXESds3yLpUY119a2N\\niB2VVQagqzrq54+IDZI2VFQLgB7i9l4gKcIPJEX4gaQIP5AU4QeSIvxAUoQfSIrwA0kRfiApwg8k\\nRfiBpAg/kBThB5Ii/EBShB9IivADSRF+ICnCDyRF+IGkCD+QFOEHkiL8QFKEH0iK8ANJEX4gKcIP\\nJEX4gaQIP5AU4QeS6miUXtv7JB2XNCrpREQMVlEUgO7rKPyFT0bEaxX8HQA9xGE/kFSn4Q9Jf7T9\\nnO2hKgoC0BudHvYviYgDtt8laaPtv0bEk+MXKP5RGJKk986p4iwDQBU62vNHxIHi/bCkhyUtnmCZ\\n4YgYjIjBmTMGOlkdgAq1HX7bZ9t+x8lpSZ+RtL2qwgB0VyfH4bMkPWz75N/5dUT8oZKqAHRd2+GP\\niJckfaTCWrrqs+9ZVHcJ6LF937usYdsLX13Tw0r6E119QFKEH0iK8ANJEX4gKcIPJEX4gaTS3G97\\n/V/3l7av3r60R5XgpLlf6O49YWd98PWGbRc/c33pd3dcdl/V5fQd9vxAUoQfSIrwA0kRfiApwg8k\\nRfiBpAg/kFSafv4vv7P8AcNf/vivelRJLh+65+ttf3dg4YWl7aM7d5e2Z+ir7wR7fiApwg8kRfiB\\npAg/kBThB5Ii/EBShB9IKk0//2T2u3+eU9
q+ZsEHGrY9+vctVZdzRt5z99MN25rVtuzK8n5+dIY9\\nP5AU4QeSIvxAUoQfSIrwA0kRfiApwg8k1bSf3/ZaSVdLOhwRlxTzpkv6jaR5kvZJujYi/tG9Mqe2\\nZVd8vrR9dPeLParkzDH0+eTVyp7/55JOHdHiDkmbImKBpE3FZwCTSNPwR8STko6dMnu5pHXF9DpJ\\n11RcF4Aua/ecf1ZEHCymX5U0q6J6APRIxxf8IiIkRaN220O2R2yPHDk62unqAFSk3fAfsj1bkor3\\nw40WjIjhiBiMiMGZMwbaXB2AqrUb/vWSVhbTKyU9Uk05AHqlafht3y/pGUkX2d5ve5Wk1ZI+bXuP\\npCuLzwAmkab9/BGxokHTpyquZdK6atkXS9vf3LKzyV/o3378ZVde22SJ8mfn1/08ATTGHX5AUoQf\\nSIrwA0kRfiApwg8kRfiBpHh0d2H/iTdK21e9d0lJa3lX3t8e+HBp++5P/KK0vdnPZvc/eHFJa3lX\\n2wW/v7G0/cKdm0vb529+W2k7+hd7fiApwg8kRfiBpAg/kBThB5Ii/EBShB9IKk0/f9P+7KHy/uyB\\nBe9v2LbhiYearL2+n7U2u3+h6X/3hfNL238y58Ezrgn9gT0/kBThB5Ii/EBShB9IivADSRF+ICnC\\nDyQ1Zfr5m/3m/UKV92d/Y++u0valb+/fR1Cf//kdDdtWqew5BM1t+BP9+FMVe34gKcIPJEX4gaQI\\nP5AU4QeSIvxAUoQfSKppP7/ttZKulnQ4Ii4p5t0l6UZJR4rF7oyIDd0qsgqTeajogYsvKm0f3fFC\\nw7byZ/pLOy67r62aprqL7v1aafu7nx0tbX9ieLjKcrqilT3/zyUtnWD+DyNiUfHq6+ADOF3T8EfE\\nk5KO9aAWAD3UyTn/Lba32l5r+9zKKgLQE+2Gf42k+ZIWSToo6QeNFrQ9ZHvE9siRo+XnSQB6p63w\\nR8ShiBiNiDcl/VTS4pJlhyNiMCIGZ84YaLdOABVrK/y2Z4/7+DlJ26spB0CvtNLVd7+kKySdZ3u/\\npO9IusL2IkkhaZ+km7pYI4AuaBr+iFgxwex7u1BLRyZzP34zGzb+poNvT93t0omFT3+ptH3et58p\\nbf/X1Q3PdCcN7vADkiL8QFKEH0iK8ANJEX4gKcIPJDVlHt0NnKrsce5zm9yX1uxn1JPhJ7vNsOcH\\nkiL8QFKEH0iK8ANJEX4gKcIPJEX4gaTo50ff2vDYb0vbmw3LfvTGyxq2jXx3TZO1T/2fQrPnB5Ii\\n/EBShB9IivADSRF+ICnCDyRF+IGk6OfHpNX8ce1Tv6++E+z5gaQIP5AU4QeSIvxAUoQfSIrwA0kR\\nfiCppuG3Pdf247Z32t5h+9Zi/nTbG23vKd7P7X65AKrSyp7/hKTbI2KhpI9Jutn2Qkl3SNoUEQsk\\nbSo+A5gkmoY/Ig5GxPPF9HFJuyTNkbRc0rpisXWSrulWkQCqd0bn/LbnSbpU0rOSZkXEwaLpVUmz\\nKq0MQFe1HH7b50h6UNJtEfH6+LaICEnR4HtDtkdsjxw5OtpRsQCq01L4bU/TWPDvi4iHitmHbM8u\\n2mdLOjzRdyNiOCIGI2Jw5oyBKmoGUIFWrvZb0r2SdkXEPeOa1ktaWUyvlPRI9eUB6JZWftJ7uaQb\\nJG2zffI3kndKWi3pt7ZXSXpZ0rXdKRFANzQNf0Q8JckNmj9VbTkAeoU7/ICkCD+QFOEHkiL8QFKE\\nH0iK8ANJEX4gKcIPJEX4gaQIP5AU4QeSIvxAUoQfSIrwA0kRfiApwg8kRfiBpAg/kBThB5Ii/EBS\\nhB9IivADSRF+ICnCDyRF+IGkCD+QFOEHkiL8QFKEH0iK8ANJNQ2/7bm2H7e90/YO27cW8++yfcD2\\nluK1rPvlAqjKWS0sc0LS7RHxvO13SHrO9sai7YcRcXf3ygPQLU3DHxEHJR0
spo/b3iVpTrcLA9Bd\\nZ3TOb3uepEslPVvMusX2VttrbZ/b4DtDtkdsjxw5OtpRsQCq03L4bZ8j6UFJt0XE65LWSJovaZHG\\njgx+MNH3ImI4IgYjYnDmjIEKSgZQhZbCb3uaxoJ/X0Q8JEkRcSgiRiPiTUk/lbS4e2UCqForV/st\\n6V5JuyLinnHzZ49b7HOStldfHoBuaeVq/+WSbpC0zfaWYt6dklbYXiQpJO2TdFNXKgTQFa1c7X9K\\nkido2lB9OQB6hTv8gKQIP5AU4QeSIvxAUoQfSIrwA0kRfiApwg8kRfiBpAg/kBThB5Ii/EBShB9I\\nivADSTkiercy+4ikl8fNOk/Saz0r4Mz0a239WpdEbe2qsrb3RcTMVhbsafhPW7k9EhGDtRVQol9r\\n69e6JGprV121cdgPJEX4gaTqDv9wzesv06+19WtdErW1q5baaj3nB1Cfuvf8AGpSS/htL7X9gu29\\ntu+oo4ZGbO+zva0YeXik5lrW2j5se/u4edNtb7S9p3ifcJi0mmrri5GbS0aWrnXb9duI1z0/7Lc9\\nIGm3pE9L2i9ps6QVEbGzp4U0YHufpMGIqL1P2PYnJL0h6RcRcUkx7/uSjkXE6uIfznMj4pt9Uttd\\nkt6oe+TmYkCZ2eNHlpZ0jaSvqMZtV1LXtaphu9Wx518saW9EvBQR/5b0gKTlNdTR9yLiSUnHTpm9\\nXNK6Ynqdxv7n6bkGtfWFiDgYEc8X08clnRxZutZtV1JXLeoI/xxJr4z7vF/9NeR3SPqj7edsD9Vd\\nzARmFcOmS9KrkmbVWcwEmo7c3EunjCzdN9uunRGvq8YFv9MtiYiPSrpK0s3F4W1firFztn7qrmlp\\n5OZemWBk6f+pc9u1O+J11eoI/wFJc8d9Pr+Y1xci4kDxfljSw+q/0YcPnRwktXg/XHM9/9NPIzdP\\nNLK0+mDb9dOI13WEf7OkBbYvsP1WSddJWl9DHaexfXZxIUa2z5b0GfXf6MPrJa0spldKeqTGWv5P\\nv4zc3GhkadW87fpuxOuI6PlL0jKNXfF/UdK36qihQV3vl/SX4rWj7tok3a+xw8D/aOzayCpJMyRt\\nkrRH0mOSpvdRbb+UtE3SVo0FbXZNtS3R2CH9VklbiteyurddSV21bDfu8AOS4oIfkBThB5Ii/EBS\\nhB9IivADSRF+ICnCDyRF+IGk/gsKoNZjuCPnqwAAAABJRU5ErkJggg==\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f62243c0b38>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/png\": 
\"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAADEJJREFUeJzt3X/oXfV9x/Hne1/tQO3AxCxLY7ak\\nRYVMbBxfwjpd6eiq0Ra03bAK6zKQRkaFdpQxcX/MP2Wslf2xCekMTUdnLVhRZvBHpeC6dtWvovHX\\nbGxIMTGaGAdVuq2avvfH96T7Vr/3R+4995779f18wOWe+/mce887h7y+59zzufd+IjORVM+vdF2A\\npG4Yfqkowy8VZfilogy/VJThl4oy/FJRhl8qyvBLRZ0yzY2dtWouN244dZqblEo58OKbvPra8Rhm\\n3bHCHxHbgL8H5oB/ysyb+62/ccOpPHL/hnE2KamPrZe+OPS6I5/2R8Qc8A/AZcBm4JqI2Dzq60ma\\nrnHe828FXsjM/Zn5M+AbwBXtlCVp0sYJ/3pg6TnGwabtl0TEjohYiIiFo8eOj7E5SW2a+NX+zNyZ\\nmfOZOb9m9dykNydpSOOE/xCw9Ord2U2bpBVgnPA/CpwTEZsi4j3A1cA97ZQladJGHurLzLci4nrg\\nfhaH+nZl5jOtVSZposYa58/MPcCelmqRNEV+vFcqyvBLRRl+qSjDLxVl+KWiDL9UlOGXijL8UlGG\\nXyrK8EtFGX6pKMMvFWX4paIMv1SU4ZeKMvxSUYZfKsrwS0UZfqkowy8VZfilogy/VJThl4oy/FJR\\nhl8qyvBLRRl+qSjDLxVl+KWixpqlNyIOAK8Dx4G3MnO+jaIkTd5Y4W/8QWa+2sLrSJoiT/ulosYN\\nfwIPRMRjEbGjjYIkTce4p/0XZ+ahiPh14MGI+M/MfHjpCs0fhR0Av7m+jXcZktow1pE/Mw8190eA\\nu4Cty6yzMzPnM3N+zeq5cTYnqUUjhz8iTo+I955YBi4Bnm6rMEmTNc55+Frgrog48Tr/kpn3tVKV\\npIkbOfyZuR/4YIu1SJoih/qkogy/VJThl4oy/FJRhl8qyvBLRfl523e56w5+qG//f9x+4US3/+Rf\\n/uNEX1+j88gvFWX4paIMv1SU4ZeKMvxSUYZfKsrwS0U5zr8CXLT3U337z9i2v0/vf/d97m/wvREq\\nGt6lt2zp2XfK2ev7PvfeR+5tuxwt4ZFfKsrwS0UZfqkowy8VZfilogy/VJThl4pynH8GXPq+3mPh\\nAGfQbxwf5jaf27Nvz7e/OVJNbfnQk3/Us+/XLvtR3+cO2i/3v/TESDVpkUd+qSjDLxVl+KWiDL9U\\nlOGXijL8UlGGXypq4Dh/ROwCPgEcyczzm7ZVwB3ARuAAcFVm/tfkylzZBo1XDzJ4PHt2x7u//8E7\\ne3e+1P+54+439TfMkf+rwLa3td0APJSZ5wAPNY8lrSADw5+ZDwOvva35CmB3s7wbuLLluiRN2Kjv\\n+ddm5uFm+WVgbUv1SJqSsS/4ZWYC2as/InZExEJELBw9dnzczUlqyajhfyUi1gE090d6rZiZOzNz\\nPjPn16yeG3Fzkto2avjvAbY3y9uBu9spR9K0DAx/RNwOfB84LyIORsS1wM3AxyJiH/CHzWNJK8jA\\ncf7MvKZH10dbrmXFOu/f/rRv/0b29u33e+nqgp/wk4oy/FJRhl8qyvBLRRl+qSjDLxXlT3e3YOOn\\n+w/lvfwXvzfgFWoO9W3618/27T+XR6dUSU0e+aWiDL9UlOGXijL8UlGGXyrK8EtFGX6pKMf5p+B/\\nVvX8lbPSzt3Rfxz/z/e9MKVKavLILxVl+KWiDL9UlOGXijL8UlGGXyrK8EtFOc4/Bc9fe2vXJXRm\\n8/f+pGffBp7u+9wrT3+j7XK
0hEd+qSjDLxVl+KWiDL9UlOGXijL8UlGGXypq4Dh/ROwCPgEcyczz\\nm7abgM8CR5vVbszMPZMqcqU79+H+U3j/8MNfm1Il7Rv0b9t0de85Dd647/0DXr3mfAbTMsyR/6vA\\ntmXab8nMLc3N4EsrzMDwZ+bDwGtTqEXSFI3znv/6iNgbEbsi4szWKpI0FaOG/1bgA8AW4DDwpV4r\\nRsSOiFiIiIWjx46PuDlJbRsp/Jn5SmYez8yfA18BtvZZd2dmzmfm/JrVc6PWKallI4U/ItYtefhJ\\nGPD1LEkzZ5ihvtuBjwBnRcRB4G+Aj0TEFiCBA8B1E6xR0gQMDH9mXrNM820TqGXFOnDHBX37N326\\n91g3wMfP/njf/nsfufekaxpWv+/bA2z44/4ndZvo/2/rt2+ev2Dlfr7h3cBP+ElFGX6pKMMvFWX4\\npaIMv1SU4ZeK8qe7W/D87/cfsrrovk/17T9j2/6+/Ze+b8tJ1zSsQT+fPffb5/Xt3/PgHQO24Ndy\\nZ5VHfqkowy8VZfilogy/VJThl4oy/FJRhl8qynH+Kfj3C77Vf4WX+nc/8NNT+/b/NH/1JCv6f6fF\\n//btv+Q0x+nfrTzyS0UZfqkowy8VZfilogy/VJThl4oy/FJRjvOvAJec9uaANQb1S+/kkV8qyvBL\\nRRl+qSjDLxVl+KWiDL9UlOGXihoY/ojYEBHfiYhnI+KZiPh8074qIh6MiH3N/ZmTL1dSW4Y58r8F\\nfDEzNwO/C3wuIjYDNwAPZeY5wEPNY0krxMDwZ+bhzHy8WX4deA5YD1wB7G5W2w1cOakiJbXvpN7z\\nR8RG4ELgB8DazDzcdL0MrG21MkkTNXT4I+IM4E7gC5n5k6V9mZlA9njejohYiIiFo8eOj1WspPYM\\nFf6IOJXF4H89M0/8GuUrEbGu6V8HHFnuuZm5MzPnM3N+zeq5NmqW1IJhrvYHcBvwXGZ+eUnXPcD2\\nZnk7cHf75UmalGG+0nsR8BngqYg48TvONwI3A9+MiGuBHwNXTaZESZMwMPyZ+V0genR/tN1yJE2L\\nn/CTijL8UlGGXyrK8EtFGX6pKMMvFWX4paIMv1SU4ZeKMvxSUYZfKsrwS0UZfqkowy8VZfilogy/\\nVJThl4oy/FJRhl8qyvBLRRl+qSjDLxVl+KWiDL9UlOGXijL8UlGGXyrK8EtFGX6pKMMvFTUw/BGx\\nISK+ExHPRsQzEfH5pv2miDgUEU80t8snX66ktpwyxDpvAV/MzMcj4r3AYxHxYNN3S2b+3eTKkzQp\\nA8OfmYeBw83y6xHxHLB+0oVJmqyTes8fERuBC4EfNE3XR8TeiNgVEWf2eM6OiFiIiIWjx46PVayk\\n9gwd/og4A7gT+EJm/gS4FfgAsIXFM4MvLfe8zNyZmfOZOb9m9VwLJUtqw1Dhj4hTWQz+1zPzWwCZ\\n+UpmHs/MnwNfAbZOrkxJbRvman8AtwHPZeaXl7SvW7LaJ4Gn2y9P0qQMc7X/IuAzwFMR8UTTdiNw\\nTURsARI4AFw3kQolTcQwV/u/C8QyXXvaL0fStPgJP6kowy8VZfilogy/VJThl4oy/FJRhl8qyvBL\\nRRl+qSjDLxVl+KWiDL9UlOGXijL8UlGRmdPbWMRR4MdLms4CXp1aASdnVmub1brA2kbVZm2/lZlr\\nhllxquF/x8YjFjJzvrMC+pjV2ma1LrC2UXVVm6f9UlGGXyqq6/Dv7Hj7/cxqbbNaF1jbqDqprdP3\\n/JK60/WRX1JHOgl/RGyLiOcj4oWIuKGLGnqJiAMR8VQz8/BCx7XsiogjEfH0krZVEfFgROxr7ped\\nJq2j2mZi5uY+M0t3uu9mbcbrqZ/2R8Qc8EPgY8BB4FHgmsx8dqqF9BARB4D5zOx8TDgiPgy8AXwt\\nM89v2v4WeC0zb27+cJ6ZmX81I7XdBLzR9czNzYQy65bOLA1c
CfwZHe67PnVdRQf7rYsj/1bghczc\\nn5k/A74BXNFBHTMvMx8GXntb8xXA7mZ5N4v/eaauR20zITMPZ+bjzfLrwImZpTvdd33q6kQX4V8P\\nvLjk8UFma8rvBB6IiMciYkfXxSxjbTNtOsDLwNoui1nGwJmbp+ltM0vPzL4bZcbrtnnB750uzszf\\nAS4DPtec3s6kXHzPNkvDNUPN3Dwty8ws/Qtd7rtRZ7xuWxfhPwRsWPL47KZtJmTmoeb+CHAXszf7\\n8CsnJklt7o90XM8vzNLMzcvNLM0M7LtZmvG6i/A/CpwTEZsi4j3A1cA9HdTxDhFxenMhhog4HbiE\\n2Zt9+B5ge7O8Hbi7w1p+yazM3NxrZmk63nczN+N1Zk79BlzO4hX/HwF/3UUNPep6P/Bkc3um69qA\\n21k8DXyTxWsj1wKrgYeAfcC3gVUzVNs/A08Be1kM2rqOaruYxVP6vcATze3yrvddn7o62W9+wk8q\\nygt+UlGGXyrK8EtFGX6pKMMvFWX4paIMv1SU4ZeK+j/3M70Cl9MZ3AAAAABJRU5ErkJggg==\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f6224765f28>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/png\": \"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAC6lJREFUeJzt3V2MXHd5gPHn7WKK5HARO65lHINp\\n6iK5kWqqxeIjqkB8OEQIG1FFsVTqSlGNWiIVCVVE6QW5jCog4qJEMsTCQAgghTS+sDDBQqSofGQT\\nGSdOShMi09hxbMdGIukNZHl7scfRkuzOTGbOzBnv+/yk1c6eM7vzauTHZ2bO7P4jM5FUzx91PYCk\\nbhi/VJTxS0UZv1SU8UtFGb9UlPFLRRm/VJTxS0W9ZpI3dsWamdy8adUkb1Iq5cTTv+O5C/MxyHVH\\nij8irgW+AMwAX87M23pdf/OmVfzs8KZRblJSD9t3PD3wdYd+2B8RM8C/Ax8EtgK7I2LrsD9P0mSN\\n8px/O/BkZj6Vmb8FvgnsbGcsSeM2SvwbgcWPMU422/5AROyNiLmImDt3fn6Em5PUprG/2p+Z+zJz\\nNjNn162dGffNSRrQKPGfAha/endls03SJWCU+B8EtkTEmyPitcANwMF2xpI0bkOf6svMFyPiJuAw\\nC6f69mfm8dYmkzRWI53nz8xDwKGWZpE0Qb69VyrK+KWijF8qyvilooxfKsr4paKMXyrK+KWijF8q\\nyvilooxfKsr4paKMXyrK+KWijF8qyvilooxfKsr4paKMXyrK+KWijF8qyvilooxfKsr4paKMXyrK\\n+KWijF8qyvilooxfKmqkVXoj4gTwPDAPvJiZs20MtdLseMO2nvsPP3N0QpNM3tb/+ttl9236m0d7\\nfu9Kvl+mwUjxN96Tmc+18HMkTZAP+6WiRo0/ge9FxEMRsbeNgSRNxqgP+6/JzFMR8SfA/RHx35n5\\nwOIrNP8p7AV448Y2nmVIasNIR/7MPNV8PgvcC2xf4jr7MnM2M2fXrZ0Z5eYktWjo+CNidUS8/uJl\\n4ANA75dvJU2NUR6HrwfujYiLP+cbmfndVqaSNHZDx5+ZTwF/2eIskibIU31SUcYvFWX8UlHGLxVl\\n/FJRxi8V5f
ttW/CW//y7nvs3c2xCk0yfx9759WX37aD3rzprvDzyS0UZv1SU8UtFGb9UlPFLRRm/\\nVJTxS0V5nr8Fr/vJZV2PsCL9x//1vl93rX5hQpOsTB75paKMXyrK+KWijF8qyvilooxfKsr4paI8\\nz6+pdceWP+u5f5dLeI/EI79UlPFLRRm/VJTxS0UZv1SU8UtFGb9UVN/z/BGxH/gQcDYzr262rQG+\\nBWwGTgDXZ+avxzemVqKDpx7suf/DG982oUlqGuTI/xXg2pdtuxk4kplbgCPN15IuIX3jz8wHgAsv\\n27wTONBcPgDsankuSWM27HP+9Zl5urn8LLC+pXkkTcjIL/hlZgK53P6I2BsRcxExd+78/Kg3J6kl\\nw8Z/JiI2ADSfzy53xczcl5mzmTm7bu3MkDcnqW3Dxn8Q2NNc3gPc1844kialb/wRcTfwY+AtEXEy\\nIm4EbgPeHxFPAO9rvpZ0Cel7nj8zdy+z670tz3LJ+vm/fLHn/h23uw79Uv44Vo30/f906u09939x\\n409G+vkrne/wk4oyfqko45eKMn6pKOOXijJ+qSj/dPcU2PEGTwUO44f/e1XvK3iqryeP/FJRxi8V\\nZfxSUcYvFWX8UlHGLxVl/FJRnuefAjNb/7y7G4/o7rb7mD/+i65HWNE88ktFGb9UlPFLRRm/VJTx\\nS0UZv1SU8UtFeZ5/Chz6/re7HmEq+XcOxssjv1SU8UtFGb9UlPFLRRm/VJTxS0UZv1RU3/P8EbEf\\n+BBwNjOvbrbdCvwDcK652i2ZeWhcQ650/c5nH37m6IQmUSWDHPm/Aly7xPbbM3Nb82H40iWmb/yZ\\n+QBwYQKzSJqgUZ7z3xQRxyJif0Rc3tpEkiZi2PjvAK4CtgGngc8td8WI2BsRcxExd+78/JA3J6lt\\nQ8WfmWcycz4zfw98Cdje47r7MnM2M2fXrZ0Zdk5JLRsq/ojYsOjLjwCPtjOOpEkZ5FTf3cC7gSsi\\n4iTwGeDdEbENSOAE8PExzihpDPrGn5m7l9h85xhmWbFO3vMXPfdf+dHjE5pkuoz6+/rH33FXS5PU\\n5Dv8pKKMXyrK+KWijF8qyvilooxfKso/3a2p9Y9PPNn1CCuaR36pKOOXijJ+qSjjl4oyfqko45eK\\nMn6pKM/zT0C/Xz3dgUtRa/I88ktFGb9UlPFLRRm/VJTxS0UZv1SU8UtFeZ5fnen3J813rXZp8nHy\\nyC8VZfxSUcYvFWX8UlHGLxVl/FJRxi8V1fc8f0RsAr4KrAcS2JeZX4iINcC3gM3ACeD6zPz1+EbV\\nSuMS290a5Mj/IvCpzNwKvB34RERsBW4GjmTmFuBI87WkS0Tf+DPzdGY+3Fx+Hngc2AjsBA40VzsA\\n7BrXkJLa96qe80fEZuCtwE+B9Zl5utn1LAtPCyRdIgaOPyIuA+4BPpmZv1m8LzOThdcDlvq+vREx\\nFxFz587PjzSspPYMFH9ErGIh/Lsy8zvN5jMRsaHZvwE4u9T3Zua+zJzNzNl1a2famFlSC/rGHxEB\\n3Ak8npmfX7TrILCnubwHuK/98SSNyyC/0vsu4GPAIxFx8XcsbwFuA74dETcCvwKuH8+IK9/hZ/zV\\nVU1e3/gz80dALLP7ve2OI2lSfIefVJTxS0UZv1SU8UtFGb9UlPFLRRm/VJTxS0UZv1SU8UtFGb9U\\nlPFLRRm/VJTxS0UZv1SU8UtFGb9UlPFLRRm/VJTxS0UZv1SU8UtFGb9UlPFLRRm/VJTxS0UZv1SU\\n8UtFGb9UlPFLRfWNPyI2RcQPIuKxiDgeEf/cbL81Ik5FxNHm47rxjyupLa8Z4DovAp/KzIcj4vXA\\nQxFxf7Pv9sz87PjGkzQufePPzNPA6eby8xHxOLBx3INJGq9X9Zw/IjYDbwV+2my6KSKORcT+iLh8\\nme/ZGxFzETF37vz8SMNKas/A8UfEZcA9wCc
z8zfAHcBVwDYWHhl8bqnvy8x9mTmbmbPr1s60MLKk\\nNgwUf0SsYiH8uzLzOwCZeSYz5zPz98CXgO3jG1NS2wZ5tT+AO4HHM/Pzi7ZvWHS1jwCPtj+epHEZ\\n5NX+dwEfAx6JiKPNtluA3RGxDUjgBPDxsUwoaSwGebX/R0AssetQ++NImhTf4ScVZfxSUcYvFWX8\\nUlHGLxVl/FJRxi8VZfxSUcYvFWX8UlHGLxVl/FJRxi8VZfxSUZGZk7uxiHPArxZtugJ4bmIDvDrT\\nOtu0zgXONqw2Z3tTZq4b5IoTjf8VNx4xl5mznQ3Qw7TONq1zgbMNq6vZfNgvFWX8UlFdx7+v49vv\\nZVpnm9a5wNmG1clsnT7nl9Sdro/8kjrSSfwRcW1E/CIinoyIm7uYYTkRcSIiHmlWHp7reJb9EXE2\\nIh5dtG1NRNwfEU80n5dcJq2j2aZi5eYeK0t3et9N24rXE3/YHxEzwP8A7wdOAg8CuzPzsYkOsoyI\\nOAHMZmbn54Qj4q+BF4CvZubVzbZ/Ay5k5m3Nf5yXZ+anp2S2W4EXul65uVlQZsPilaWBXcDf0+F9\\n12Ou6+ngfuviyL8deDIzn8rM3wLfBHZ2MMfUy8wHgAsv27wTONBcPsDCP56JW2a2qZCZpzPz4eby\\n88DFlaU7ve96zNWJLuLfCDy96OuTTNeS3wl8LyIeioi9XQ+zhPXNsukAzwLruxxmCX1Xbp6kl60s\\nPTX33TArXrfNF/xe6ZrM/Cvgg8Anmoe3UykXnrNN0+magVZunpQlVpZ+SZf33bArXreti/hPAZsW\\nfX1ls20qZOap5vNZ4F6mb/XhMxcXSW0+n+14npdM08rNS60szRTcd9O04nUX8T8IbImIN0fEa4Eb\\ngIMdzPEKEbG6eSGGiFgNfIDpW334ILCnubwHuK/DWf7AtKzcvNzK0nR8303diteZOfEP4DoWXvH/\\nJfCvXcywzFx/Cvy8+Tje9WzA3Sw8DPwdC6+N3AisBY4ATwDfB9ZM0WxfAx4BjrEQ2oaOZruGhYf0\\nx4Cjzcd1Xd93Pebq5H7zHX5SUb7gJxVl/FJRxi8VZfxSUcYvFWX8UlHGLxVl/FJR/w84hqDlbTB+\\n5wAAAABJRU5ErkJggg==\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f62243f8978>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/png\": 
\"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAADR5JREFUeJzt3V+MHfV5xvHnyWIa1U4kbFzXfzY1\\nJCaVi4TTbq2IoCRVCsYIyU4rufFF6kqIjZoglYqSIveiXKIEQrhIUJdiYSpCiJQgrNaKIVYVGjVN\\nvFDbQFzAcTfC/42NBFaagDdvL3YMC94z5/icOTNn/X4/0mrnzG/mzKvRPjtz5ndmfo4IAcjnfU0X\\nAKAZhB9IivADSRF+ICnCDyRF+IGkCD+QFOEHkiL8QFIX1bmxS+cPxfLhOXVuEkhl4pW39OqpSXey\\nbE/ht329pPskDUn654i4q2z55cNz9NMdw71sEkCJ1Wte6XjZrk/7bQ9J+oaktZJWStpoe2W37weg\\nXr185l8taX9EHIiINyV9W9K6asoC0G+9hH+ppOnnGAeLee9ie9T2uO3xEycne9gcgCr1/Wp/RIxF\\nxEhEjCxcMNTvzQHoUC/hPyRp+tW7ZcU8ALNAL+HfJWmF7ctsXyzpc5K2VVMWgH7ruqsvIs7YvkXS\\nDk119W2JiBcqq+wCsmbJqtL2HYd311QJ8I6e+vkjYruk7RXVAqBGfL0XSIrwA0kRfiApwg8kRfiB\\npAg/kFSt9/NjZp8aHS1t/+HYWE2VIBOO/EBShB9IivADSRF+ICnCDyRF+IGk6OqrweG/u7q0ffjB\\nfTVVAryDIz+QFOEHkiL8QFKEH0iK8ANJEX4gKcIPJEU/fw2u3vDfpe0Td/9fafvBM6dL25ddNO+8\\nawI48gNJEX4gKcIPJEX4gaQIP5AU4QeSIvxAUj3189uekPSGpElJZyJipIqiLjT/tOzHpe1rVD6E\\n900fuqbKcgbGbfvLR3S/7rffqqmSnKr4ks+fRMSrFbwPgBpx2g8k1Wv4Q9KTtp+xXT7sDICB0utp\\n/zURccj270h6yvb/RMTT0xco/imMStKHlnIrATAoejryR8Sh4vdxSY9LWj3DMmMRMRIRIwsXDPWy\\nOQAV6jr8tufa/sDZaUnXSXq+qsIA9Fcv5+GLJD1u++z7fCsivl9JVQD6ruvwR8QBSVdVWEtaOw7v\\nbrqErj3z6zdL2zdfds4nwbfd85E/KF33ly/vL21fP7f8OQcoR1cfkBThB5Ii/EBShB9IivADSRF+\\nICm+b4ue/NFvXVzaXtaNuWZJ+a3M96/4SGn7+lncRToIOPIDSRF+ICnCDyRF+IGkCD+QFOEHkiL8\\nQFL086Mxoy8dKG0fu+LymirJiSM/kBThB5Ii/EBShB9IivADSRF+ICnCDyRFP3+HnvzlnJZt7R5B\\njf5o9zyAfprNj1s/iyM/kBThB5Ii/EBShB9IivADSRF+ICnCDyTVtp/f9hZJN0o6HhFXFvPmS3pM\\n0nJJE5I2RMRr/SuzeV9f92clrS/WVsdMyu6Lb3dP/F+3GQYbF65OjvwPSbr+PfPukLQzIlZI2lm8\\nBjCLtA1/RDwt6dR7Zq+TtLWY3ippfcV1Aeizbj/zL4qII8X0UUmLKqoHQE16vuAXESEpWrXbHrU9\\nbnv8xMnJXjcHoCLdhv+Y7cWSVPw+3mrBiBiLiJGIGFm4YKjLzQGoWrfh3yZpUzG9SdIT1ZQDoC5t\\nw2/7UUk/lvRR2wdt3yTpLknX2n5Z0p8WrwHMIm37+SNiY4umz1Rcy0Db/tRjTZfQlbE27evnnq6l\\nDgwevuEHJEX4gaQIP5AU4QeSIvxAUoQfSIrwA0kRfiApwg8kRfiBpAg/kBThB5Ii/EBShB9IivAD\\nSRF+ICnCDyRF+IGkCD+QFOE
HkiL8QFKEH0iK8ANJEX4gKcIPJEX4gaQIP5AU4QeSIvxAUoQfSKrt\\nEN22t0i6UdLxiLiymHenpJslnSgW2xwR2/tVJMp99D/+smXbcu0tXXfNklU9bfvorVeXtu/58jd7\\nen/0TydH/ockXT/D/HsjYlXxQ/CBWaZt+CPiaUmnaqgFQI16+cx/i+29trfYvqSyigDUotvw3y/p\\nw5JWSToi6Z5WC9oetT1ue/zEyckuNwegal2FPyKORcRkRPxG0gOSVpcsOxYRIxExsnDBULd1AqhY\\nV+G3vXjay89Ker6acgDUpZOuvkclfVrSpbYPSvpHSZ+2vUpSSJqQ9IU+1gigD9qGPyI2zjD7wT7U\\ngi69/7/mdb3u0b8t76dv53fv/c/S9jVfb/09gouWLild99920YPcT3zDD0iK8ANJEX4gKcIPJEX4\\ngaQIP5BU264+DL49t5fcNnt7u7V397bxNu9/w7V/0bLtzAsv9rZt9IQjP5AU4QeSIvxAUoQfSIrw\\nA0kRfiApwg8kRT8/+uq1u8+0bPvg2hoLwTk48gNJEX4gKcIPJEX4gaQIP5AU4QeSIvxAUvTzoydr\\n1870ZPd3fHDPvpZtp79/eZt37/FZAyjFkR9IivADSRF+ICnCDyRF+IGkCD+QFOEHkmrbz297WNLD\\nkhZJCkljEXGf7fmSHpO0XNKEpA0R8Vr/SkU3Hn790tL2R35/WY9baN2PL0k7Dpf11dOP36ROjvxn\\nJN0WESslfVzSl2yvlHSHpJ0RsULSzuI1gFmibfgj4khEPFtMv6Gpf/VLJa2TtLVYbKuk9f0qEkD1\\nzuszv+3lkj4m6SeSFkXEkaLpqKY+FgCYJToOv+15kr4r6daIeH16W0SEpq4HzLTeqO1x2+MnTk72\\nVCyA6nQUfttzNBX8RyLie8XsY7YXF+2LJR2fad2IGIuIkYgYWbhgqIqaAVSgbfhtW9KDkvZFxNem\\nNW2TtKmY3iTpierLA9AvndzS+wlJn5f0nO2zfTObJd0l6Tu2b5L0C0kb+lMi2lmzZFXf3nv0pQOl\\n7X8+7/XSdgyutuGPiB9Jcovmz1RbDoC68A0/ICnCDyRF+IGkCD+QFOEHkiL8QFI8uvsCUHbb7GX/\\nenPpuleM7iptH7ui/PHaY6Wt0tDKK1q2bf/Bd9qsjX7iyA8kRfiBpAg/kBThB5Ii/EBShB9IivAD\\nSdHPf4H73xsfKF/gcG/v/8VDHy9t//kfv9SyrdfnEBy99erS9j1f/mZP73+h48gPJEX4gaQIP5AU\\n4QeSIvxAUoQfSIrwA0l5aqSteoxc9f746Y7h2raHwXbVV79Y2r70ofLhvydf635E+PKhw2ev1Wte\\n0fieX7V61P67cOQHkiL8QFKEH0iK8ANJEX4gKcIPJEX4gaTa3s9ve1jSw5IWSQpJYxFxn+07Jd0s\\n6USx6OaI2N6vQnHh2XN7m/vtb+/t/cueF/Cp0dHSdX841m5Egtmvk4d5nJF0W0Q8a/sDkp6x/VTR\\ndm9E3N2/8gD0S9vwR8QRSUeK6Tds75O0tN+FAeiv8/rMb3u5pI9J+kkx6xbbe21vsX1Ji3VGbY/b\\nHj9xcrKnYgFUp+Pw254n6buSbo2I1yXdL+nDklZp6szgnpnWi4ixiBiJiJGFC4YqKBlAFToKv+05\\nmgr+IxHxPUmKiGMRMRkRv5H0gKTV/SsTQNXaht+2JT0oaV9EfG3a/MXTFvuspOerLw9Av3Rytf8T\\nkj4v6TnbZ++D3Cxpo+1Vmur+m5D0hb5UCHSp/LbdC/OW3vPRydX+H0ma6f5g+vSBWYxv+AFJEX4g\\nKcIPJEX4gaQIP5AU4QeSIvxAUoQfSIrwA0kRfiApwg8kRfiBpAg/kBThB5KqdYhu2yck/WLarEsl\\nvVpbAednUGsb1LokautWlbX9XkQs7GTBWsN/zsbt8YgYaayA
EoNa26DWJVFbt5qqjdN+ICnCDyTV\\ndPgHeUykQa1tUOuSqK1bjdTW6Gd+AM1p+sgPoCGNhN/29bZftL3f9h1N1NCK7Qnbz9nebXu84Vq2\\n2D5u+/lp8+bbfsr2y8XvGYdJa6i2O20fKvbdbts3NFTbsO1/t/0z2y/Y/ptifqP7rqSuRvZb7af9\\ntockvSTpWkkHJe2StDEiflZrIS3YnpA0EhGN9wnb/qSk05Iejogri3lfkXQqIu4q/nFeEhF/PyC1\\n3SnpdNMjNxcDyiyePrK0pPWS/koN7ruSujaogf3WxJF/taT9EXEgIt6U9G1J6xqoY+BFxNOSTr1n\\n9jpJW4vprZr646ldi9oGQkQciYhni+k3JJ0dWbrRfVdSVyOaCP9SSa9Me31QgzXkd0h60vYztkeb\\nLmYGi4ph0yXpqKRFTRYzg7YjN9fpPSNLD8y+62bE66pxwe9c10TEH0paK+lLxentQIqpz2yD1F3T\\n0cjNdZlhZOm3Nbnvuh3xumpNhP+QpOFpr5cV8wZCRBwqfh+X9LgGb/ThY2cHSS1+H2+4nrcN0sjN\\nM40srQHYd4M04nUT4d8laYXty2xfLOlzkrY1UMc5bM8tLsTI9lxJ12nwRh/eJmlTMb1J0hMN1vIu\\ngzJyc6uRpdXwvhu4Ea8jovYfSTdo6or/zyX9QxM1tKjrckl7ip8Xmq5N0qOaOg18S1PXRm6StEDS\\nTkkvS/qBpPkDVNu/SHpO0l5NBW1xQ7Vdo6lT+r2aGo53d/E31+i+K6mrkf3GN/yApLjgByRF+IGk\\nCD+QFOEHkiL8QFKEH0iK8ANJEX4gqf8H8Gf+Q+zn8YwAAAAASUVORK5CYII=\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f6224424978>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/png\": 
\"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAACwhJREFUeJzt3V2IXPUZx/Hfr9vYUvUib023MW2s\\nhEIQGssQCkqx+BZDIXoj5kJSKl0vFBREKvaiXoZSlRaqsNZgWqwiqJiLUI1BCNK3bCTmtTVWVky6\\nZtekYLypZn16sScyJruz48w5c058vh8YZuac2T0PQ76Z1+TviBCAfL5U9wAA6kH8QFLEDyRF/EBS\\nxA8kRfxAUsQPJEX8QFLEDyT15UEebMmioVi5YsEgDwmkMv7ux3r/5LS7uW1f8dteJ+k3koYk/T4i\\nNne6/coVC/SPl1b0c0gAHay94d2ub9vz037bQ5J+J+lGSaslbbS9utffB2Cw+nnNv1bSWxHxdkR8\\nJOkZSRvKGQtA1fqJf7mk9ucYR4ttn2F7xPaY7bGpE9N9HA5AmSp/tz8iRiOiFRGtpYuHqj4cgC71\\nE/8xSe3v3l1SbANwHugn/t2SVtm+1PYFkm6VtK2csQBUreeP+iLitO27JL2kmY/6tkTEwdImA1Cp\\nvj7nj4jtkraXNAuAAeLrvUBSxA8kRfxAUsQPJEX8QFLEDyRF/EBSxA8kRfxAUsQPJEX8QFLEDyRF\\n/EBSxA8kRfxAUsQPJEX8QFLEDyRF/EBSxA8kRfxAUsQPJEX8QFLEDyRF/EBSxA8kRfxAUsQPJEX8\\nQFJ9rdJre1zSKUnTkk5HRKuMoQBUr6/4Cz+KiPdL+D0ABoin/UBS/cYfkl62vcf2SBkDARiMfp/2\\nXxURx2x/XdIO2/+MiF3tNyj+UhiRpG8tL+NVBoAy9PXIHxHHivNJSS9IWjvLbUYjohURraWLh/o5\\nHIAS9Ry/7QttX3zmsqTrJR0oazAA1ernefgySS/YPvN7/hQRfy5lKgCV6zn+iHhb0vdKnAXAAPFR\\nH5AU8QNJET+QFPEDSRE/kBTxA0nxfdsG2PO/jzru/+lv7+m4/437Hi1zHCTBIz+QFPEDSRE/kBTx\\nA0kRP5AU8QNJET+QFJ/zN8D4x0s67v/GI3/p/AvuK3EYpMEjP5AU8QNJET+QFPEDSRE/kBTxA0kR\\nP5AU8QNJET+QFPEDSRE/kBTxA0kRP5AU8QNJET+Q1Lzx295ie9L2gbZti2zvsH2kOF9Y7ZhfbEP+\\npOMJqEI3j/xPSlp31rb7Je2MiFWSdhbXAZxH5o0/InZJOnnW5g2SthaXt0q6qeS5AFSs19f8yyJi\\norj8nqRlJc0DYED6fsMvIkJSzLXf9ojtMdtjUyem+z0cgJL0Gv9x28OSVJxPznXDiBiNiFZEtJYu\\nHurxcADK1mv82yRtKi5vkvRiOeMAGJRuPup7WtJfJX3X9lHbt0vaLOk620ckXVtcB3Aemff/7Y+I\\njXPsuqbkWQAMEN/wA5IifiAp4geSIn4gKeIHkiJ+ICniB5IifiAp4geSIn4gKeIHkiJ+ICniB5Ii\\nfiCpef9JL3Jbf+0tHfdPH3qz4/6X/rO3zHFQIh75gaSIH0iK+IGkiB9IiviBpIgfSIr4gaSIH0iK\\n+IGkiB9IiviBpIgfSIr4gaSIH0iK+IGk5v33/La3SPqxpMmIuLzY9qCkn0maKm72QERsr2pI1Ge+\\nf6+P81c3j/xPSlo3y/ZHImJNcSJ84Dwzb/wRsUvSyQHMAmCA+nnNf5ftfba32F5Y2kQABqLX+B+T\\ndJmkNZImJD001w1tj9gesz02dWK6x8MBKFtP8UfE8YiYjohPJD0uaW2H245GRCsiWksXD/U6J4CS\\n9RS/7eG2qzdLOlDOOAAGpZu
P+p6WdLWkJbaPSvqlpKttr5EUksYl3VHhjAAqMG/8EbFxls1PVDAL\\nenTDN9fUPQLOQ3zDD0iK+IGkiB9IiviBpIgfSIr4gaRYovsLoMplsPtdohvNxSM/kBTxA0kRP5AU\\n8QNJET+QFPEDSRE/kBTxA0kRP5AU8QNJET+QFPEDSRE/kBTxA0kRP5AU8QNJET+QFPEDSRE/kBTx\\nA0kRP5AU8QNJET+Q1Lzx215h+1Xbh2wftH13sX2R7R22jxTnC6sfF0BZunnkPy3p3ohYLekHku60\\nvVrS/ZJ2RsQqSTuL6wDOE/PGHxETEfF6cfmUpMOSlkvaIGlrcbOtkm6qakgA5ftcr/ltr5R0haS/\\nS1oWERPFrvckLSt1MgCV6jp+2xdJek7SPRHxQfu+iAhJMcfPjdgesz02dWK6r2EBlKer+G0v0Ez4\\nT0XE88Xm47aHi/3DkiZn+9mIGI2IVkS0li4eKmNmACXo5t1+S3pC0uGIeLht1zZJm4rLmyS9WP54\\nAKrSzRLdV0q6TdJ+22fWgn5A0mZJz9q+XdI7kjqv5QygUeaNPyJek+Q5dl9T7jgABoVv+AFJET+Q\\nFPEDSRE/kBTxA0kRP5BUN5/zo2I3fu2/Hfe/vPurA5oEmfDIDyRF/EBSxA8kRfxAUsQPJEX8QFLE\\nDyTF5/wN8BUv6Lj/0eV/G9Ak59r+yrO1HRvV4pEfSIr4gaSIH0iK+IGkiB9IiviBpIgfSIr4gaSI\\nH0iK+IGkiB9IiviBpIgfSIr4gaSIH0hq3vhtr7D9qu1Dtg/avrvY/qDtY7b3Fqf11Y8LoCzd/Gce\\npyXdGxGv275Y0h7bO4p9j0TEr6sbD0BV5o0/IiYkTRSXT9k+LGl51YMBqNbnes1ve6WkKyT9vdh0\\nl+19trfYXjjHz4zYHrM9NnViuq9hAZSn6/htXyTpOUn3RMQHkh6TdJmkNZp5ZvDQbD8XEaMR0YqI\\n1tLFQyWMDKAMXcVve4Fmwn8qIp6XpIg4HhHTEfGJpMclra1uTABl6+bdfkt6QtLhiHi4bftw281u\\nlnSg/PEAVKWbd/uvlHSbpP229xbbHpC00fYaSSFpXNIdlUwIoBLdvNv/miTPsmt7+eMAGBS+4Qck\\nRfxAUsQPJEX8QFLEDyRF/EBSxA8kRfxAUsQPJEX8QFLEDyRF/EBSxA8kRfxAUo6IwR3MnpL0Ttum\\nJZLeH9gAn09TZ2vqXBKz9arM2b4dEUu7ueFA4z/n4PZYRLRqG6CDps7W1LkkZutVXbPxtB9IiviB\\npOqOf7Tm43fS1NmaOpfEbL2qZbZaX/MDqE/dj/wAalJL/LbX2f6X7bds31/HDHOxPW57f7Hy8FjN\\ns2yxPWn7QNu2RbZ32D5SnM+6TFpNszVi5eYOK0vXet81bcXrgT/ttz0k6U1J10k6Kmm3pI0RcWig\\ng8zB9rikVkTU/pmw7R9K+lDSHyLi8mLbrySdjIjNxV+cCyPi5w2Z7UFJH9a9cnOxoMxw+8rSkm6S\\n9BPVeN91mOsW1XC/1fHIv1bSWxHxdkR8JOkZSRtqmKPxImKXpJNnbd4gaWtxeatm/vAM3ByzNUJE\\nTETE68XlU5LOrCxd633XYa5a1BH/cknvtl0/qmYt+R2SXra9x/ZI3cPMYlmxbLokvSdpWZ3DzGLe\\nlZsH6ayVpRtz3/Wy4nXZeMPvXFdFxPcl3SjpzuLpbSPFzGu2Jn1c09XKzYMyy8rSn6rzvut1xeuy\\n1RH/MUkr2q5fUmxrhIg4VpxPSnpBzVt9+PiZRVKL88ma5/lUk1Zunm1laTXgvmvSitd1xL9b0irb\\nl9q+QNKtkrbVMMc5bF9YvBEj2xdKul7NW314m6RNxeVNkl6scZbPaMrKzXOtLK2a77vGrXgdEQM/\\nSVqvmXf8/y3pF3XMMMdc35H0RnE6WPdskp7WzNPAjzXz3sjt
khZL2inpiKRXJC1q0Gx/lLRf0j7N\\nhDZc02xXaeYp/T5Je4vT+rrvuw5z1XK/8Q0/ICne8AOSIn4gKeIHkiJ+ICniB5IifiAp4geSIn4g\\nqf8DR6qDR/5YFEoAAAAASUVORK5CYII=\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f62243cda58>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    }\n   ],\n   \"source\": [\n    \"%matplotlib inline\\n\",\n    \"import skimage\\n\",\n    \"from skimage import transform\\n\",\n    \"for nn in range(10):\\n\",\n    \"    CharacterNum = np.random.randint(len(imagedata))\\n\",\n    \"    FileNum = 7\\n\",\n    \"    im = imagedata[CharacterNum][FileNum]\\n\",\n    \"    im = skimage.transform.resize(im, (28,28))\\n\",\n    \"    plt.figure(); plt.imshow(im)\\n\",\n    \"\\n\",\n    \"print(\\\"Displayed.\\\")\\n\",\n    \"im.dtype\\n\",\n    \"\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 99,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"torch.Size([1, 64, 15, 15]) torch.Size([1, 64, 7, 7]) torch.Size([1, 64, 3, 3]) torch.Size([1, 64, 1, 1])\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"# Let us compute the final size of various possible architectures and input image sizes!\\n\",\n    \"import torch\\n\",\n    \"from torch import autograd\\n\",\n    \"\\n\",\n    \"cv1 = torch.nn.Conv2d(1, 64, 3, stride=2)\\n\",\n    \"#mp1 = torch.nn.MaxPool2d(2, stride=2)\\n\",\n    \"cv2 = torch.nn.Conv2d(64, 64, 3, stride=2)\\n\",\n    \"#mp2 = torch.nn.MaxPool2d(2, stride=2)\\n\",\n    \"cv3 = torch.nn.Conv2d(64, 64, 3, stride=2)\\n\",\n    \"cv4 = torch.nn.Conv2d(64, 64, 3, stride=2)\\n\",\n    \"mp3 = torch.nn.MaxPool2d(2, stride=1)\\n\",\n    \"\\n\",\n    \"fakeim = autograd.Variable(torch.randn(1, 1, 31, 31))\\n\",\n    \"outcv1 = cv1(fakeim)\\n\",\n    \"outcv2 = cv2(outcv1)\\n\",\n    \"outcv3 = cv3(outcv2)\\n\",\n    \"outcv4 = cv4(outcv3)\\n\",\n    \"#outmp3 = 
mp3(outcv3)\\n\",\n    \"print(outcv1.size(), outcv2.size(), outcv3.size(), outcv4.size())\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 171,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"17 17 [ 152   75 1047] [0 3 0]\\n\"\n     ]\n    },\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"/usr/local/lib/python3.5/dist-packages/skimage/transform/_warps.py:84: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15.\\n\",\n      \"  warn(\\\"The default mode, 'constant', will be changed to 'reflect' in \\\"\\n\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"<matplotlib.image.AxesImage at 0x7f6223a08748>\"\n      ]\n     },\n     \"execution_count\": 171,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f62240af0b8>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/png\": 
\"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAADtlJREFUeJzt3X/sVfV9x/HXy68wNzAtv8YQcaC1\\nLqaraL6lOpmzNgqlTdBuMZqtcwkpjamZZm0icXG6NVns5o80TeeKhZUuTuuqVv4gQ2ZZjKkDvyry\\nQ+pQihVEQKwVTUrL1/f+uIfsjnzP/V7uPfeeL76fj4R8z/e8z7nnnRNe33Pv+dzPvY4IAcjnpLob\\nAFAPwg8kRfiBpAg/kBThB5Ii/EBShB9IivADSRF+IKmTu9nZ9kJJ35A0IOk7EXFHq+2nTh6I2bPG\\ndXNIAC3seu3XevOtYbezbcfhtz0g6VuSLpe0W9IztldHxItl+8yeNU4b187q9JAARjFvwWttb9vN\\n0/55kl6OiJ0R8StJD0pa3MXjAeijbsI/U1Lzn5ndxToAJ4Ce3/CzvdT2kO2hAweHe304AG3qJvx7\\nJDW/gD+9WPf/RMTyiBiMiMFpUwa6OByAKnUT/mcknW17ju3xkq6RtLqatgD0Wsd3+yPiiO0bJK1V\\nY6hvZURsq6wzAD3V1Th/RKyRtKaiXgD0Ee/wA5Ii/EBShB9IivADSRF+ICnCDyRF+IGkCD+QFOEH\\nkiL8QFKEH0iK8ANJdTWxByemr+y9oLT2o90fbbnv8594sOp2UBOu/EBShB9IivADSRF+ICnCDyRF\\n+IGkGOpL6PEHLiytnXbnj1vv/HrFzYxi0flXlNaG9+3v6DFPOuWU0tqhz55XWnvqm9/u6HhjFVd+\\nICnCDyRF+IGkCD+QFOEHkiL8QFJdDfXZ3iXpkKRhSUciYrCKptBb4w9F3S20bc3zj5fWrtyxoLR2\\n+MuTSmvvb/1JaW3CwxtKawsenltaW/v6ptLaWFXFOP+nIuLNCh4HQB/xtB9Iqtvwh6THbT9re2kV\\nDQHoj26f9s+PiD22f1vSOts/iYgnmzco/igslaQzZvJuYmCs6OrKHxF7ip/7JT0qad4I2yyPiMGI\\nGJw2ZaCbwwGoUMfhtz3B9qlHlyVdIWlrVY0B6K1unodPl/So7aOP828R8R+VdIWemvrtp0trO775\\nyVH2HjtDWj88e215sXyEsGMLTisf6mtVu2zLe6W1m6fs6KqnbnQc/ojYKal8/iOAMY2hPiApwg8k\\nRfiBpAg/kBThB5LiLXcfUJ+9eHGL6qullZ1//MH6kMoqtZq512qo70e/P6G0dnOfPxC1GVd+ICnC\\nDyRF+IGkCD+QFOEHkiL8QFIM9Z3AlvxsfmntyE/Lh/MAiSs/kBbhB5Ii/EBShB9IivADSRF+ICmG\\n+o5Tq9ly/R9ee7fyR2w1O61XDi65qLT2w9v+sbR2+skTe9FOR77202dKa7fO+UQfO2kfV34gKcIP\\nJEX4gaQIP5AU4QeSIvxAUo6I1hvYKyV9TtL+iPhYsW6ypO9Lmi1pl6SrI+Lnox1s8LxTYuPaWV22\\njKNaDcsd/kz58NJ/rbivF+201IshxLf/vHyIcP3ff6O09lsnja+8l7Fi3oLXNPTCL93Otu1c+b8r\\naeEx65ZJeiIizpb0RPE7gBPIqOGPiCclvXXM6sWSVhXLqyRdWXFfAHqs09f80yNib7H8hhpf1w3g\\nBNL1Db9o3DQovXFge6ntIdtDBw4Od3s4ABXpNPz7bM+QpOLn/rINI2J5RAxGxOC0KQMdHg5A1ToN\\n/2pJ1xXL10l6rJp2APTLqLP6bD8g6VJJU23vlnSbpDskPWR7iRpf/HZ1L5vE8atjOK+VVt9z18o5\\nK64vrc2+9enS2lXfm1d5Lx8
0o4Y/Iq4tKX264l4A9BHv8AOSIvxAUoQfSIrwA0kRfiApPsBzjKvj\\nAzXHkpeW3FteXFJeanXeWtUyDQNy5QeSIvxAUoQfSIrwA0kRfiApwg8kxVDfGPD5ly9vUT1QWnn/\\nj85vsV+eIauR/OJPLyytfej+/+5jJ2MXV34gKcIPJEX4gaQIP5AU4QeSIvxAUgz1jQHvXVI+nNfK\\nugf+peJOkAlXfiApwg8kRfiBpAg/kBThB5Ii/EBS7XxX30pJn5O0PyI+Vqy7XdIX9X9Tzm6JiDW9\\navJEseilRaW14U+93tFj9vsDJbv5wNBMH375QdDOlf+7khaOsP6eiJhb/EsffOBEM2r4I+JJSW/1\\noRcAfdTNa/4bbG+2vdL2pMo6AtAXnYb/XklnSZoraa+ku8o2tL3U9pDtoQMHhzs8HICqdRT+iNgX\\nEcMR8b6k+yTNa7Ht8ogYjIjBaVMGOu0TQMU6Cr/tGU2/XiVpazXtAOiXdob6HpB0qaSptndLuk3S\\npbbnSgpJuyR9qYc9AuiBUcMfEdeOsHpFD3o54XU6lv/6V/+gRbW/Y+ete5FOu/PHpbWx9AWYfELv\\n6HiHH5AU4QeSIvxAUoQfSIrwA0kRfiApPr33OC2c88kW1cOllR3fKt9v51X/1EVH1dryV617OXvq\\n9aW1M5c9XXU7Lc3f/PnS2gTtLK0x9biBKz+QFOEHkiL8QFKEH0iK8ANJEX4gKYb6jlMcLh/O27Os\\nfEbcWBrO68ZJc97r6/FW/OJ3SmsTFpYP5w2c85EWj8pQn8SVH0iL8ANJEX4gKcIPJEX4gaQIP5AU\\nQ30jOGdF+cy12Sqfubb1Lz8Yw3mtvPSH3yutLTr36tLagtPKH3PX1y4qrc2+tbOZgmvW/6Cj/TLh\\nyg8kRfiBpAg/kBThB5Ii/EBSo4bf9izb622/aHub7RuL9ZNtr7O9o/g5qfftAqiKI6L1Bo1v5J0R\\nEc/ZPlXSs5KulPQXkt6KiDtsL5M0KSJubvVYg+edEhvXzqqm8y499O6HSmsrPjqno8fM/sGQPzvy\\nbmnti2fMr/x42c/3SOYteE1DL/zS7Ww76pU/IvZGxHPF8iFJ2yXNlLRY0qpis1Vq/EEAcII4rtf8\\ntmdLOl/SBknTI2JvUXpD0vRKOwPQU22H3/ZESQ9Luiki3mmuReO1w4ivH2wvtT1ke+jAweGumgVQ\\nnbbCb3ucGsG/PyIeKVbvK+4HHL0vsH+kfSNieUQMRsTgtCkDVfQMoALt3O23pBWStkfE3U2l1ZKu\\nK5avk/RY9e0B6JV2JvZcLOkLkrbYPnp79RZJd0h6yPYSSa9KKp/VAWDMGTX8EfGUpLKhg09X207/\\nMJxXvTNOnlha47yNPbzDD0iK8ANJEX4gKcIPJEX4gaQIP5BU2g/wZOgJ2XHlB5Ii/EBShB9IivAD\\nSRF+ICnCDyRF+IGkCD+QFOEHkiL8QFKEH0iK8ANJEX4gKcIPJEX4gaQIP5AU4QeSIvxAUoQfSIrw\\nA0m18y29s2yvt/2i7W22byzW3257j+1Nxb9FvW8XQFXa+fTeI5K+EhHP2T5V0rO21xW1eyLizt61\\nB6BX2vmW3r2S9hbLh2xvlzSz140B6K3jes1ve7ak8yVtKFbdYHuz7ZW2J5Xss9T2kO2hAweHu2oW\\nQHXaDr/tiZIelnRTRLwj6V5JZ0maq8Yzg7tG2i8ilkfEYEQMTpsyUEHLAKrQVvhtj1Mj+PdHxCOS\\nFBH7ImI4It6XdJ+keb1rE0DV2rnbb0krJG2PiLub1s9o2uwqSVurbw9Ar7Rzt/9iSV+QtMX20S+4\\nu0XStbbnSgpJuyR9qScdAuiJdu72PyXJI5TWVN8OgH7hHX5AUoQfSIrwA0kRfiApwg8kRfiBpAg/\\nkBThB5Ii/EBShB9IivADSRF+ICnCDyRF+IGkCD+QFOEHkiL8
QFKEH0iK8ANJEX4gKcIPJEX4gaQI\\nP5AU4QeSIvxAUoQfSKqdL+o8xfZG2y/Y3mb7b4v1c2xvsP2y7e/bHt/7dgFUpZ0r/2FJl0XEeZLm\\nSlpo+0JJX5d0T0R8RNLPJS3pXZsAqjZq+KPh3eLXccW/kHSZpB8U61dJurInHQLoibZe89seKL6e\\ne7+kdZJekfR2RBwpNtktaWbJvkttD9keOnBwuIqeAVSgrfBHxHBEzJV0uqR5kn6v3QNExPKIGIyI\\nwWlTBjpsE0DVjutuf0S8LWm9pIskfdj2yUXpdEl7Ku4NQA+1c7d/mu0PF8u/KelySdvV+CPwJ8Vm\\n10l6rFdNAqieI6L1BvbH1bihN6DGH4uHIuLvbJ8p6UFJkyU9L+nPIuLwKI91QNKrxa9TJb3ZXfuV\\nGkv90MvI6GVkzb38bkRMa2enUcPfK7aHImKwloOPYCz1Qy8jo5eRddoL7/ADkiL8QFJ1hn95jcce\\nyVjqh15GRi8j66iX2l7zA6gXT/uBpGoJv+2Ftl8qZgQuq6OHpl522d5ie5PtoT4fe6Xt/ba3Nq2b\\nbHud7R3Fz0k19nK77T3Fudlke1Gfeplle73tF4uZpDcW6/t+blr00vdzU/kM24jo6z813i/wiqQz\\nJY2X9IKkc/vdR1M/uyRNrenYl0i6QNLWpnX/IGlZsbxM0tdr7OV2SV+t4bzMkHRBsXyqpP+RdG4d\\n56ZFL30/N5IsaWKxPE7SBkkXSnpI0jXF+n+WdH07j1fHlX+epJcjYmdE/EqNNwotrqGP2kXEk5Le\\nOmb1YjXeVCX1cbZkSS+1iIi9EfFcsXxIjXeUzlQN56ZFL30XDZXNsK0j/DMlvdb0e+mMwD4JSY/b\\nftb20hr7OGp6ROwtlt+QNL3OZiTdYHtz8bKgLy9BmtmeLel8Na5ytZ6bY3qRajg33cywPRY3/KT5\\nEXGBpM9I+rLtS+pu6KhoPI+rczjmXklnqfEhLnsl3dXPg9ueKOlhSTdFxDvNtX6fmxF6qeXcRBcz\\nbI9VR/j3SJrV9HutMwIjYk/xc7+kR9U4oXXaZ3uGJBU/99fVSETsK/6zvS/pPvXx3Ngep0bY7o+I\\nR4rVtZybkXqp89wUx+96hm0d4X9G0tnFHcrxkq6RtLqGPmR7gu1Tjy5LukLS1tZ79dxqNWZJSjXP\\nljwatMJV6tO5sW1JKyRtj4i7m0p9PzdlvdRxbiqfYdvPu5VNdy0XqXHX9BVJf11HD0UfZ6ox2vCC\\npG397kXSA2o8Zfy1Gq/VlkiaIukJSTsk/aekyTX28q+StkjarEbwZvSpl/lqPKXfLGlT8W9RHeem\\nRS99PzeSPq7GDNrNavyx+Zum/8cbJb0s6d8l/UY7j8c7/ICkuOEHJEX4gaQIP5AU4QeSIvxAUoQf\\nSIrwA0kRfiCp/wWs7N+iHs8dGwAAAABJRU5ErkJggg==\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f6223c23128>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/png\": 
\"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAADL5JREFUeJzt3X+o3fV9x/Hna2ncVnXUxCxkMVuq\\nk65SbJRLaFFKf8yaZoMoG0VhxYGQMiYodLDQweb2lx1Ttz+KI06pDGfrVsXARJuJIHZDvboYo1nn\\nD9JpGpMYK7r+0db43h/nG7gL9+Ye7/mecxI/zwcc7vd8f5zviy/3db/n+/2e7z2pKiS15xemHUDS\\ndFh+qVGWX2qU5ZcaZfmlRll+qVGWX2qU5ZcaZfmlRn1olIWTbAL+DlgG/ENV3XSi+c9esazWr1s+\\nyiolncC+V3/OG28ezTDzLrn8SZYB3wQuA14Dnkqyo6peWGiZ9euW8+TD65a6SkmL2Hj5q0PPO8rb\\n/o3AS1X1SlX9DPg2sGWE15M0QaOUfy0w98/Ma904SaeAsZ/wS7I1yWyS2cNHjo57dZKGNEr59wNz\\nD+DP6cb9P1W1vapmqmpm1cplI6xOUp9GKf9TwPlJPprkNOAqYEc/sSSN25LP9lfVu0muAx5mcKnv\\nzqp6vrdkksZqpOv8VfUg8GBPWSRNkJ/wkxpl+aVGWX6pUZZfapTllxpl+aVGWX6pUZZfapTllxpl\\n+aVGWX6pUZZfapTllxpl+aVGWX6pUZZfapTllxpl+aVGWX6pUZZfapTllxpl+aVGWX6pUZZfapTl\\nlxpl+aVGjfR1XUn2Ae8AR4F3q2qmj1CSxm+k8nc+V1Vv9PA6kibIt/1So0YtfwHfS/J0kq19BJI0\\nGaO+7b+0qvYn+VVgZ5L/qqrH5s7Q/VHYCvDra/s4ypDUh5H2/FW1v/t5CLgf2DjPPNuraqaqZlat\\nXDbK6iT1aMnlT3J6kjOPDQNfBPb0FUzSeI3yPnw1cH+SY6/zT1X1UC+pJI3dkstfVa8An+wxi6QJ\\n8lKf1CjLLzXK8kuNsvxSoyy/1CjLLzXK8kuNsvxSoyy/1CjLLzXK8kuNsvxSo5r97xqX/9qGaUfQ\\nEP52378vOO3jp314gkk+eNzzS42y/FKjLL/UKMsvNcryS42y/FKjmr3U95OHzp12BHVO3/TKtCM0\\nyT2/1CjLLzXK8kuNsvxSoyy/1CjLLzVq0Ut9Se4Efhc4VFWf6MatAL4DrAf2AV+uqh+PL2b/Hr/w\\nvmlHUOdyvMNyGobZ838L2HTcuG3AI1V1PvBI91zSKWTR8lfVY8Cbx43eAtzVDd8FXNFzLkljttRj\\n/tVVdaAbfp3B13VLOoWMfMKvqgqohaYn2ZpkNsns4SNHR12dpJ4stfwHk6wB6H4eWmjGqtpeVTNV\\nNbNq5bIlrk5S35Za/h3ANd3wNcAD/cSRNCmLlj/JPcB/AB9L8lqSa4GbgMuSvAj8dvdc0ilk0ev8\\nVXX1ApO+0HMWSRPkJ/ykRll+qVGWX2qU5ZcaZfmlRll+qVGWX2qU5ZcaZfmlRll+qVGWX2qU5Zca\\n1ex39WmyvvnWuiUt9/HTPtxzEh3jnl9qlOWXGmX5pUZZfqlRll9qlOWXGuWlPk3EjgtWTjuCjuOe\\nX2qU5ZcaZfmlRll+qVGWX2qU5ZcaNcx39d2Z5FCSPXPG3Zhkf5Jd3WPzeGNK6tswe/5vAZvmGX9r\\nVW3oHg/2G0vSuC1a/qp6DHhzAlkkTdAox/zXJdndHRac1VsiSROx1PLfBpwHbAAOADcvNGOSrUlm\\nk8wePnJ0iauT1Lcllb+qDlbV0ap6D7gd2HiCebdX1UxVzaxauWypOSX1bEnlT7JmztMrgT0LzSvp\\n5LToXX1J7gE+C5yd5DXgL4D
PJtkAFLAP+OoYM0oag0XLX1VXzzP6jjFkkTRBfsJPapTllxpl+aVG\\nWX6pUZZfapTllxpl+aVGWX6pUZZfapTllxpl+aVGWX6pUZZfapTllxpl+aVGWX6pUZZfapTllxpl\\n+aVGWX6pUZZfapTllxpl+aVGWX6pUZZfapTllxq1aPmTrEvyaJIXkjyf5Ppu/IokO5O82P08a/xx\\nJfVl0e/qA94FvlZVzyQ5E3g6yU7gD4FHquqmJNuAbcCfji/q/D797O8tOO0TK15fcNrt674/jjjS\\nKWPRPX9VHaiqZ7rhd4C9wFpgC3BXN9tdwBXjCimpf+/rmD/JeuAi4AlgdVUd6Ca9DqzuNZmksRq6\\n/EnOAL4L3FBVb8+dVlUF1ALLbU0ym2T28JGjI4WV1J+hyp9kOYPi311V93WjDyZZ001fAxyab9mq\\n2l5VM1U1s2rlsj4yS+rBMGf7A9wB7K2qW+ZM2gFc0w1fAzzQfzxJ4zLM2f5LgK8AzyXZ1Y37OnAT\\ncG+Sa4EfAl8eT0RJ47Bo+avqcSALTP5Cv3Hev1/50ssLTvufEyx3ORv6D/MB8JOHzl3yso9feN/i\\nM+mk4Sf8pEZZfqlRll9qlOWXGmX5pUZZfqlRw1znP6k9/KNdC07b/IPNE0wyHkc/96OJru/0Ta8s\\neVkvn55a3PNLjbL8UqMsv9Qoyy81yvJLjbL8UqNO+Ut9J/Lgxx6cdoTRTfZK39j8ziVbFpz2r9/3\\nX0FMg3t+qVGWX2qU5ZcaZfmlRll+qVGWX2rUB/pSn04eXs47+bjnlxpl+aVGWX6pUZZfapTllxpl\\n+aVGDfMtveuSPJrkhSTPJ7m+G39jkv1JdnWPU/+/ZUoNGeY6/7vA16rqmSRnAk8n2dlNu7Wq/mZ8\\n8SSNyzDf0nsAONANv5NkL7B23MEkjdf7OuZPsh64CHiiG3Vdkt1J7kxy1gLLbE0ym2T28JGjI4WV\\n1J+hy5/kDOC7wA1V9TZwG3AesIHBO4Ob51uuqrZX1UxVzaxauayHyJL6MFT5kyxnUPy7q+o+gKo6\\nWFVHq+o94HZg4/hiSurbMGf7A9wB7K2qW+aMXzNntiuBPf3HkzQuw5ztvwT4CvBckmNfjPd14Ook\\nG4AC9gFfHUtCSWMxzNn+x4HMM+kD8K9xpXb5CT+pUZZfapTllxpl+aVGWX6pUZZfapTllxpl+aVG\\nWX6pUZZfapTllxpl+aVGWX6pUZZfapTllxpl+aVGWX6pUZZfapTllxpl+aVGWX6pUZZfapTllxpl\\n+aVGWX6pUZZfatQwX9T5S0meTPJskueT/GU3/qNJnkjyUpLvJDlt/HEl9WWYPf9Pgc9X1SeBDcCm\\nJJ8CvgHcWlW/CfwYuHZ8MSX1bdHy18D/dk+Xd48CPg/8Szf+LuCKsSSUNBZDHfMnWdZ9PfchYCfw\\nMvBWVb3bzfIasHaBZbcmmU0ye/jI0T4yS+rBUOWvqqNVtQE4B9gI/NawK6iq7VU1U1Uzq1YuW2JM\\nSX17X2f7q+ot4FHg08BHknyom3QOsL/nbJLGaJiz/auSfKQb/mXgMmAvgz8Cv9/Ndg3wwLhCSupf\\nqurEMyQXMjiht4zBH4t7q+qvkpwLfBtYAfwn8AdV9dNFXusw8MPu6dnAG6PF79XJlMcs8zPL/OZm\\n+Y2qWjXMQouWf1ySzFbVzFRWPo+TKY9Z5meW+S01i5/wkxpl+aVGTbP826e47vmcTHnMMj+zzG9J\\nWaZ2zC9punzbLzVqKuVPsinJD7o7ArdNI8OcLPuSPJdkV5LZCa/7ziSHkuyZM25Fkp1JXux+njXF\\nLDcm2d9tm11JNk8oy7okjyZ5obuT9Ppu/MS3zQmyTHzb9H6HbVVN9MHg8wIvA+cCpwHPAhdMOsec\\nPPuAs6e07s8AFwN75oz7a2BbN7wN+MYUs9wI/MkUtssa4OJu
+Ezgv4ELprFtTpBl4tsGCHBGN7wc\\neAL4FHAvcFU3/u+BPxrm9aax598IvFRVr1TVzxh8UGjLFHJMXVU9Brx53OgtDD5UBRO8W3KBLFNR\\nVQeq6plu+B0GnyhdyxS2zQmyTFwN9HaH7TTKvxZ4dc7zBe8InJACvpfk6SRbp5jjmNVVdaAbfh1Y\\nPc0wwHVJdneHBRM5BJkryXrgIgZ7ualum+OywBS2zSh32B7PE35waVVdDHwJ+OMkn5l2oGNq8D5u\\nmpdjbgPOY/BPXA4AN09y5UnOAL4L3FBVb8+dNultM0+WqWybGuEO2+NNo/z7gXVznk/1jsCq2t/9\\nPATcz2CDTtPBJGsAup+HphWkqg52v2zvAbczwW2TZDmDst1dVfd1o6eybebLMs1t061/5Dtsp1H+\\np4DzuzOUpwFXATumkIMkpyc589gw8EVgz4mXGrsdDO6ShCnfLXmsaJ0rmdC2SRLgDmBvVd0yZ9LE\\nt81CWaaxbXq/w3aSZyvnnLXczOCs6cvAn00jQ5fjXAZXG54Fnp90FuAeBm8Zf87gWO1aYCXwCPAi\\n8G/Aiilm+UfgOWA3g+KtmVCWSxm8pd8N7Ooem6exbU6QZeLbBriQwR20uxn8sfnzOb/HTwIvAf8M\\n/OIwr+cn/KRGecJPapTllxpl+aVGWX6pUZZfapTllxpl+aVGWX6pUf8HFoc1jdcuwZUAAAAASUVO\\nRK5CYII=\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f6223b5e470>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/png\": \"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAADllJREFUeJzt3XGMHOV5x/Hv06shKSAFG2M5xtRA\\nIQ1KwaZXiwgakZCA60Y1VBWFNimtKI5akIKU/oGo2tBUQiQqoFRtiUxwIBEhIQWEVbkFaiFR1BZz\\nUDAGl2I7RtgY25ikkD9C8PH0j52TrtbN3rI7u2P7/X6k087OO7PzMPi3szvvvDuRmUgqz8+1XYCk\\ndhh+qVCGXyqU4ZcKZfilQhl+qVCGXyqU4ZcKZfilQv38ICtHxArg68AY8M3MvLnb8ifMHcsli+cM\\nsklJXex49V3eeHMyelm27/BHxBjw98BngJ3AUxGxLjNfrFtnyeI5bHx4cb+blDSL5Re/2vOyg3zs\\nXw5szcztmfkz4HvAqgFeT9IIDRL+RcD0t5md1TxJh4Ghn/CLiNURMRERE/v2Tw57c5J6NEj4dwHT\\nv8CfVM37fzJzTWaOZ+b4/HljA2xOUpMGCf9TwOkRcUpEHAVcDqxrpixJw9b32f7MPBAR1wIP0+nq\\nW5uZLzRWmaShGqifPzPXA+sbqkXSCHmFn1Qowy8VyvBLhTL8UqEMv1Qowy8VyvBLhTL8UqEMv1Qo\\nwy8VyvBLhTL8UqEMv1Qowy8VyvBLhTL8UqEMv1Qowy8VyvBLhTL8UqEMv1Qowy8VyvBLhTL8UqEM\\nv1Qowy8VaqDbdUXEDuBtYBI4kJnjTRQlafgGCn/lk5n5RgOvI2mE/NgvFWrQ8CfwSEQ8HRGrmyhI\\n0mgM+rH//MzcFREnAo9GxH9n5uPTF6jeFFYDnLyoiW8Zkpow0JE/M3dVj3uBB4HlMyyzJjPHM3
N8\\n/ryxQTYnqUF9hz8ijomI46amgYuAzU0VJmm4BvkcvgB4MCKmXue7mfkvjVQlaej6Dn9mbgfObrAW\\nSSNkV59UKMMvFcrwS4Uy/FKhDL9UKMMvFcrwS4Uy/FKhDL9UKMMvFcrwS4Uy/FKh/HUNNWblSytr\\n2xZ+8K3atjtPfmIY5WgWHvmlQhl+qVCGXyqU4ZcKZfilQhl+qVB29R0Cdh74SW3bVSef3/j2Hn7t\\n2b7XvfjDS7u0vlbbsrPLWhed94e1bY/84K7ZSlKfPPJLhTL8UqEMv1Qowy8VyvBLhTL8UqFm7eqL\\niLXAZ4G9mfmxat5c4PvAEmAHcFlm/mh4ZR7Z+u3O69Zl161L7iP/9ge1bS/9+rf7qgXgIxNz6tt+\\n4fXatnVn1v93TOZ7tW1j4bFrEL3svbuAFQfNux7YkJmnAxuq55IOI7OGPzMfB948aPYq4O5q+m7g\\nkobrkjRk/X5uWpCZu6vp1+ncrlvSYWTgL02ZmUDWtUfE6oiYiIiJffsnB92cpIb0G/49EbEQoHrc\\nW7dgZq7JzPHMHJ8/b6zPzUlqWr/hXwdcWU1fCTzUTDmSRqWXrr57gQuAEyJiJ/Bl4Gbgvoi4CngF\\nuGyYRQ5D99Fp/Ylf+5XatveOqv/UE3QZZbfhpC5b7G903pLf3VTbdsqaq7uuewZP1bb97Yfr27pZ\\nx7zatpWLzqlt63d0YrdRlDft+XRt2z8s+s++tneomjX8mXlFTdOFDdciaYS8SkIqlOGXCmX4pUIZ\\nfqlQhl8qVLE/4PlbL+7va731F51V23bgqedr217+1q/Wtn1064n12/voP/VWWEPOWN29u27/H3+8\\nS2v/Pww6St1HUf60vqn+90kPSx75pUIZfqlQhl8qlOGXCmX4pUIZfqlQxXb1XfOhV/tbb2P9et1G\\nCp7xR0/Xtq0f4N55dbZ/t76WU3+v/+1NfOX2vtcdpQuuqh+deHSXkYnbv3b4d2X2yiO/VCjDLxXK\\n8EuFMvxSoQy/VCjDLxWq2K6+I12/3Xn9/ijmIHb8dX332pK/+I++XvPof67vzhubP7+27eXPHR5d\\nmU3wyC8VyvBLhTL8UqEMv1Qowy8VyvBLhYrOHba7LBCxFvgssDczP1bNuxG4GthXLXZDZq6fbWPj\\nZ38gNz68eKCC2/ab562qbTvww1dq29roQjsSDOOeikfy/4vlF7/KxHM/jV6W7eXIfxewYob5t2Xm\\n0upv1uBLOrTMGv7MfBx4cwS1SBqhQb7zXxsRmyJibUQc31hFkkai3/DfDpwGLAV2A7fULRgRqyNi\\nIiIm9u2f7HNzkprWV/gzc09mTmbme8AdwPIuy67JzPHMHJ8/b6zfOiU1rK/wR8TCaU8vBTY3U46k\\nUZl1VF9E3AtcAJwQETuBLwMXRMRSIIEdwBeGWKOkIZg1/Jl5xQyz7xxCLUe0YfRXq96nN7/ddgmH\\nPK/wkwpl+KVCGX6pUIZfKpThlwpl+KVC+eu979PZD/ywtu3pZfXvpUfyMFIdnjzyS4Uy/FKhDL9U\\nKMMvFcrwS4Uy/FKh7Op7n25asKm27WIcuafDh0d+qVCGXyqU4ZcKZfilQhl+qVCGXyqUXX0jcv6m\\n365te+KsB0ZYidThkV8qlOGXCmX4pUIZfqlQhl8q1Kzhj4jFEfFYRLwYES9ExBer+XMj4tGIeLl6\\nPH745UpqSi9H/gPAlzLzTOBc4JqIOBO4HtiQmacDG6rnkg4Ts4Y/M3dn5jPV9NvAFmARsAq4u1rs\\nbuCSYRUpqXnv6zt/RCwBlgFPAgsyc3fV9DqwoNHKJA1Vz+GPiGOB+4HrMvOt6W2ZmUDWrLc6IiYi\\nYmLf/smBipXUnJ7CHxFz6AT/nsycuhZ1T0QsrNoXAntnWjcz12TmeGaOz5831kTNkhrQy9n+AO4E\\ntmTmrdOa1gFXVtNXAg81
X56kYellYM95wOeB5yNi6p5TNwA3A/dFxFXAK8BlwylR0jDMGv7MfAKI\\nmuYLmy3nyHXMiu31ja+Nrg5pilf4SYUy/FKhDL9UKMMvFcrwS4Uy/FKhDL9UKMMvFcrwS4Uy/FKh\\nDL9UKMMvFcrwS4Uy/FKhDL9UKMMvFcrwS4Uy/FKhDL9UKMMvFcrwS4Uy/FKhDL9UKMMvFcrwS4Uy\\n/FKhDL9UqFnv1RcRi4FvAwuABNZk5tcj4kbgamBftegNmbl+WIUeKlacPN6l9UBty7Z7lnVZ79ku\\nbdJw9HKX3gPAlzLzmYg4Dng6Ih6t2m7LzL8ZXnmShqWXu/TuBnZX029HxBZg0bALkzRc7+s7f0Qs\\nAZYBT1azro2ITRGxNiKOr1lndURMRMTEvv2TAxUrqTk9hz8ijgXuB67LzLeA24HTgKV0PhncMtN6\\nmbkmM8czc3z+vLEGSpbUhJ7CHxFz6AT/nsx8ACAz92TmZGa+B9wBLB9emZKaNmv4IyKAO4EtmXnr\\ntPkLpy12KbC5+fIkDUsvZ/vPAz4PPB8RU31SNwBXRMRSOt1/O4AvDKXCFiy76U9r20488O+1bf/7\\n++fWtm395DcGqklqWi9n+58AYoamI75PXzqSeYWfVCjDLxXK8EuFMvxSoQy/VKheuvqOSBvfebe2\\n7cS/q+/O2/qd+tF52y60O0+HD4/8UqEMv1Qowy8VyvBLhTL8UqEMv1SoYrv6Xjsw4w8PARBzjqpt\\n23bht4ZRjjRyHvmlQhl+qVCGXyqU4ZcKZfilQhl+qVDFdvVdcsxP6tte2TjCSqR2eOSXCmX4pUIZ\\nfqlQhl8qlOGXCmX4pUL1cqPOD0TExoh4LiJeiIi/quafEhFPRsTWiPh+RNQPhZN0yOnlyP8O8KnM\\nPBtYCqyIiHOBrwK3ZeYvAT8CrhpemZKaNmv4s2Pqipg51V8CnwL+sZp/N3DJUCqUNBQ9feePiLHq\\n9tx7gUeBbcCPM/NAtchOYFHNuqsjYiIiJvbtn2yiZkkN6Cn8mTmZmUuBk4DlwC/3uoHMXJOZ45k5\\nPn/eWJ9lSmra+zrbn5k/Bh4DPg58KCKmxgacBOxquDZJQ9TL2f75EfGhavqDwGeALXTeBH6nWuxK\\n4KFhFSmpeZGZ3ReIOIvOCb0xOm8W92XmVyLiVOB7wFzgv4DPZeY7s7zWPuCV6ukJwBuDld+oQ6ke\\na5mZtcxsei2/mJnze1lp1vAPS0RMZOZ4KxufwaFUj7XMzFpm1m8tXuEnFcrwS4VqM/xrWtz2TA6l\\neqxlZtYys75qae07v6R2+bFfKlQr4Y+IFRHxUjUi8Po2aphWy46IeD4ino2IiRFve21E7I2IzdPm\\nzY2IRyPi5eqx/qaCw6/lxojYVe2bZyNi5YhqWRwRj0XEi9VI0i9W80e+b7rUMvJ90/gI28wc6R+d\\n6wW2AacCRwHPAWeOuo5p9ewATmhp258AzgE2T5v3NeD6avp64Kst1nIj8Gct7JeFwDnV9HHA/wBn\\ntrFvutQy8n0DBHBsNT0HeBI4F7gPuLya/w3gT3p5vTaO/MuBrZm5PTN/RudCoVUt1NG6zHwcePOg\\n2avoXFQFIxwtWVNLKzJzd2Y+U02/TeeK0kW0sG+61DJy2dHYCNs2wr8IeHXa89oRgSOSwCMR8XRE\\nrG6xjikLMnN3Nf06sKDNYoBrI2JT9bVgJF9BpouIJcAyOke5VvfNQbVAC/tmkBG2B/OEH5yfmecA\\nvwFcExGfaLugKdn5HNdmd8ztwGl0fsRlN3DLKDceEccC9wPXZeZb09tGvW9mqKWVfZMDjLA9WBvh\\n3wUsnva81RGBmbmretwLPEhnh7ZpT0QsBKge97ZVSGbuqf6xvQfcwQj3TUTMoRO2ezLzgWp2K/tm\\nplra3DfV9gceYdtG+J8CTq/OUB4FXA6sa6EOIuKYiDhua
hq4CNjcfa2hW0dnlCS0PFpyKmiVSxnR\\nvomIAO4EtmTmrdOaRr5v6mppY980PsJ2lGcrp521XEnnrOk24M/bqKGq41Q6vQ3PAS+MuhbgXjof\\nGd+l813tKmAesAF4GfhXYG6LtXwHeB7YRCd4C0dUy/l0PtJvAp6t/la2sW+61DLyfQOcRWcE7SY6\\nbzZ/Oe3f8UZgK/AD4OheXs8r/KRCecJPKpThlwpl+KVCGX6pUIZfKpThlwpl+KVCGX6pUP8HziGw\\nVyOGQaIAAAAASUVORK5CYII=\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f6223ab1208>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/png\": \"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAADLJJREFUeJzt3X/IXYV9x/H3t0+jHRqoiVlIY7ZU\\nJxul0yjPQkdD6dpZUylExygKKxnIUroJCh1bsLC6/WXLVEbZ3OIMDcXZuqkzMLc0FUEcRX3iYoxm\\nmz8WMTEmMWnR/TFrHr/7457A0/Dc57k+99x7knzfL3i4555z7j0fDvncc8+v3MhMJNXzoa4DSOqG\\n5ZeKsvxSUZZfKsryS0VZfqkoyy8VZfmloiy/VNSHh3lxRKwH/gqYAP4+M2+fa/4Ll0zk6lWLhlmk\\npDnsf/093jo+HYPMu+DyR8QE8NfAVcAB4JmI2J6ZL/Z7zepVi3h6x6qFLlLSPNZe/frA8w7ztX8t\\n8HJmvpqZPwO+D2wY4v0kjdEw5V8JzPyYOdCMk3QGGPkBv4jYFBFTETF19Nj0qBcnaUDDlP8gMHMH\\n/qJm3M/JzC2ZOZmZk8uWTgyxOEltGqb8zwCXRsTHI+Ic4HpgezuxJI3ago/2Z+aJiLgJ2EHvVN/W\\nzHyhtWSSRmqo8/yZ+SjwaEtZJI2RV/hJRVl+qSjLLxVl+aWiLL9UlOWXirL8UlGWXyrK8ktFWX6p\\nKMsvFWX5paIsv1SU5ZeKsvxSUZZfKsryS0VZfqkoyy8VZfmloiy/VJTll4qy/FJRll8qyvJLRVl+\\nqaihfq4rIvYD7wDTwInMnGwjlKTRG6r8jd/KzLdaeB9JY+TXfqmoYcufwA8jYldEbGojkKTxGPZr\\n/7rMPBgRvwjsjIj/zMwnZs7QfChsAvillW3sZUhqw1Bb/sw82DweAR4G1s4yz5bMnMzMyWVLJ4ZZ\\nnKQWLbj8EXFeRCw+OQx8AdjbVjBJozXM9/DlwMMRcfJ9/iEz/62VVJJGbsHlz8xXgctbzCJpjDzV\\nJxVl+aWiLL9UlOWXirL8UlFn9SV3k9/8Wt9py3/0Rt9p//Lvj4wijnRaccsvFWX5paIsv1SU5ZeK\\nsvxSUZZfKuqsPtX3kePv95124n9e6zvt6o+tGUUcnSZ2vLG76winBbf8UlGWXyrK8ktFWX6pKMsv\\nFWX5paLO6lN9T37n7/pP/E7/Sev2/E77YdS6o7uW9522+hs/HmOSM5Nbfqkoyy8VZfmloiy/VJTl\\nl4qy/FJR857qi4itwJeAI5n5yWbcEuAHwGpgP/DlzPzJ6GKO15OXPdR1BA1gHZ6SHcYgW/7vAutP\\nGbcZeCwzLwUea55LOoPMW/7MfAI4fsroDcC2ZngbcG3LuSSN2EL3+Zd
n5qFm+E16P9ct6Qwy9AG/\\nzEwg+02PiE0RMRURU0ePTQ+7OEktWWj5D0fECoDm8Ui/GTNzS2ZOZubksqUTC1ycpLYttPzbgY3N\\n8EbA37eSzjDzlj8i7gd+DPxqRByIiBuB24GrIuIl4Leb55LOIPOe58/MG/pM+nzLWSSNkVf4SUVZ\\nfqkoyy8VZfmloiy/VJTll4qy/FJRll8qyvJLRVl+qSjLLxVl+aWizurf6tPZ7bz1r3Yd4Yzmll8q\\nyvJLRVl+qSjLLxVl+aWiLL9UlOWXirL8UlGWXyrK8ktFWX6pKMsvFWX5paLmvasvIrYCXwKOZOYn\\nm3G3AX8AHG1muzUzHx1VSOmD2vHG7q4jnPYG2fJ/F1g/y/i7MnNN82fxpTPMvOXPzCeA42PIImmM\\nhtnnvyki9kTE1oi4oLVEksZioeW/G7gEWAMcAu7oN2NEbIqIqYiYOnpseoGLk9S2BZU/Mw9n5nRm\\nvg/cA6ydY94tmTmZmZPLlk4sNKekli2o/BGxYsbT64C97cSRNC6DnOq7H/gscGFEHAC+CXw2ItYA\\nCewHvjrCjJJGYN7yZ+YNs4y+dwRZJI2RV/hJRVl+qSjLLxVl+aWiLL9UlOWXiir7Q51Xf2xN1xHU\\nePeLv7Gg153LMy0nqcUtv1SU5ZeKsvxSUZZfKsryS0VZfqmosqf6WPvrXSco5UP/917faef+q6fs\\nuuCWXyrK8ktFWX6pKMsvFWX5paIsv1RU2VN9O/75e11H0JAu//YfzjHVH+qcj1t+qSjLLxVl+aWi\\nLL9UlOWXipq3/BGxKiIej4gXI+KFiLi5Gb8kInZGxEvN4wWjjyupLYOc6jsBfD0zn42IxcCuiNgJ\\n/D7wWGbeHhGbgc3An44uqvTznvuTv+k6whlt3i1/Zh7KzGeb4XeAfcBKYAOwrZltG3DtqEJKat8H\\n2uePiNXAFcBTwPLMPNRMehNY3moySSM1cPkj4nzgQeCWzHx75rTMTCD7vG5TRExFxNTRY9NDhZXU\\nnoHKHxGL6BX/vsx8qBl9OCJWNNNXAEdme21mbsnMycycXLZ0oo3MklowyNH+AO4F9mXmnTMmbQc2\\nNsMbgUfajydpVAY52v9p4CvA8xFx8m6JW4HbgQci4kbgNeDLo4koaRTmLX9mPglEn8mfbzeOpHHx\\nCj+pKMsvFWX5paIsv1SU5ZeKsvxSUZZfKsryS0VZfqkoyy8VZfmloiy/VJTll4qy/FJRll8qyvJL\\nRVl+qSjLLxVl+aWiLL9UlOWXirL8UlGWXyrK8ktFWX6pKMsvFWX5paIG+ZXeVRHxeES8GBEvRMTN\\nzfjbIuJgROxu/q4ZfVxJbRnkV3pPAF/PzGcjYjGwKyJ2NtPuysy/HF08SaMyyK/0HgIONcPvRMQ+\\nYOWog0karQ+0zx8Rq4ErgKeaUTdFxJ6I2BoRF/R5zaaImIqIqaPHpocKK6k9A5c/Is4HHgRuycy3\\ngbuBS4A19L4Z3DHb6zJzS2ZOZubksqUTLUSW1IaByh8Ri+gV/77MfAggMw9n5nRmvg/cA6wdXUxJ\\nbRvkaH8A9wL7MvPOGeNXzJjtOmBv+/EkjcogR/s/DXwFeD4idjfjbgVuiIg1QAL7ga+OJKGkkRjk\\naP+TQMwy6dH240gaF6/wk4qy/FJRll8qyvJLRVl+qSjLLxVl+aWiLL9UlOWXirL8UlGWXyrK8ktF\\nWX6pKMsvFWX5paIsv1SU5ZeKsvxSUZZfKsryS0VZfqkoyy8VZfmloiy/VJTll4qy/FJRg/xQ50ci\\n4umIeC4iXoiIP2/GfzwinoqIlyPiBxFxzujjSmrLIFv+d4HPZeblwBpgfUR8CvgWcFdm/grwE+DG\\n0cWU1LZ5y589/9s8XdT8JfA54J+a8duAa0eSUNJIDLTPHxETzc9zHwF2Aq8AP83ME80sB4CVfV67\\nK
SKmImLq6LHpNjJLasFA5c/M6cxcA1wErAV+bdAFZOaWzJzMzMllSycWGFNS2z7Q0f7M/CnwOPCb\\nwEcj4sPNpIuAgy1nkzRCgxztXxYRH22GfwG4CthH70Pgd5vZNgKPjCqkpPZFZs49Q8Rl9A7oTdD7\\nsHggM/8iIi4Gvg8sAf4D+L3MfHee9zoKvNY8vRB4a7j4rTqd8phldmaZ3cwsv5yZywZ50bzlH5WI\\nmMrMyU4WPovTKY9ZZmeW2S00i1f4SUVZfqmoLsu/pcNlz+Z0ymOW2ZlldgvK0tk+v6Ru+bVfKqqT\\n8kfE+oj4r+aOwM1dZJiRZX9EPB8RuyNiaszL3hoRRyJi74xxSyJiZ0S81Dxe0GGW2yLiYLNudkfE\\nNWPKsioiHo+IF5s7SW9uxo993cyRZezrpvU7bDNzrH/0rhd4BbgYOAd4DvjEuHPMyLMfuLCjZX8G\\nuBLYO2Pct4HNzfBm4FsdZrkN+OMO1ssK4MpmeDHw38Anulg3c2QZ+7oBAji/GV4EPAV8CngAuL4Z\\n/7fA1wZ5vy62/GuBlzPz1cz8Gb0LhTZ0kKNzmfkEcPyU0RvoXVQFY7xbsk+WTmTmocx8thl+h94V\\npSvpYN3MkWXssqe1O2y7KP9K4PUZz/veETgmCfwwInZFxKYOc5y0PDMPNcNvAsu7DAPcFBF7mt2C\\nseyCzBQRq4Er6G3lOl03p2SBDtbNMHfYnsoDfrAuM68Evgj8UUR8putAJ2Xve1yXp2PuBi6h95+4\\nHALuGOfCI+J84EHglsx8e+a0ca+bWbJ0sm5yiDtsT9VF+Q8Cq2Y87/SOwMw82DweAR6mt0K7dDgi\\nVgA0j0e6CpKZh5t/bO8D9zDGdRMRi+iV7b7MfKgZ3cm6mS1Ll+umWf7Qd9h2Uf5ngEubI5TnANcD\\n2zvIQUScFxGLTw4DXwD2zv2qkdtO7y5J6PhuyZNFa1zHmNZNRARwL7AvM++cMWns66Zfli7WTet3\\n2I7zaOWMo5bX0Dtq+grwjS4yNDkupne24TnghXFnAe6n95XxPXr7ajcCS4HHgJeAHwFLOszyPeB5\\nYA+94q0YU5Z19L7S7wF2N3/XdLFu5sgy9nUDXEbvDto99D5s/mzGv+OngZeBfwTOHeT9vMJPKsoD\\nflJRll8qyvJLRVl+qSjLLxVl+aWiLL9UlOWXivp/Yqs8zOQZRQcAAAAASUVORK5CYII=\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f6223a3ee10>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/png\": 
\"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAADF5JREFUeJzt3X/oXfV9x/Hna1/jHFWoiVnIYtao\\nkxUZNrovwaGUrp2tk0IURlFY8Q8hZdSh0P0hHWyuf9kxlf2xOeKUyrBWNxUDk1kngrQM9auLMTHb\\n/EFEs5jE2KL7Y239+t4f9wS+C98f1+89917j5/mAy/fcc86958Uhr3vu+ZWbqkJSe35p2gEkTYfl\\nlxpl+aVGWX6pUZZfapTllxpl+aVGWX6pUZZfatQpo7w4yRXAXwMzwN9X1a3LzX/W2pnasnnNKIuU\\ntIwDb/6Cd96dzzDzrrr8SWaAvwEuB94Cnkuyq6peXuo1Wzav4dnHN692kZJWsO0rbw497yhf+7cB\\nr1bV61X1c+AHwPYR3k/SBI1S/k3Awo+Zt7pxkk4CYz/gl2RHkrkkc0ePzY97cZKGNEr5DwILd+DP\\n7sb9P1W1s6pmq2p2/bqZERYnqU+jlP854Pwk5yQ5FbgG2NVPLEnjtuqj/VX1QZIbgMcZnOq7p6r2\\n9ZZM0liNdJ6/qh4DHuspi6QJ8go/qVGWX2qU5ZcaZfmlRll+qVGWX2qU5ZcaZfmlRll+qVGWX2qU\\n5ZcaZfmlRll+qVGWX2qU5ZcaZfmlRll+qVGWX2qU5ZcaZfmlRll+qVGWX2qU5ZcaZfmlRll+qVGW\\nX2rUSD/XleQA8D4wD3xQVbN9hJI0fiOVv/O7VfVOD+8jaYL82i81atTyF/DDJM8n2dFHIEmTMerX\\n/suq6mCSXwWeSPIfVfX0whm6D4UdAL++qY+9DEl9GGnLX1UHu79HgEeAbYvMs7OqZqtqdv26mVEW\\nJ6lHqy5/kk8lOeP4MPBlYG9fwSSN1yjfwzcAjyQ5/j7fr6p/6SWVpLFbdfmr6nXgcz1mkTRBnuqT\\nGmX5pUZZfqlRll9qlOWXGuUld/pY+8qvbV1y2uP/vXuCST553PJLjbL8UqMsv9Qoyy81yvJLjbL8\\nUqM81beI5U4vSZ8UbvmlRll+qVGWX2qU5ZcaZfmlRll+qVGe6lvEV/f9ZMlpD7z52xNMcnL5289+\\nf8lpF5562qre09Ou4+OWX2qU5ZcaZfmlRll+qVGWX2qU5ZcateKpviT3AF8FjlTVb3Xj1gIPAFuA\\nA8DXqmrp82MnmT8+841VTdPqTudpOobZ8n8PuOKEcTcDT1bV+cCT3XNJJ5EVy19VTwPvnjB6O3Bv\\nN3wvcFXPuSSN2Wr3+TdU1aFu+G0GP9ct6SQy8gG/qiqglpqeZEeSuSRzR4/Nj7o4ST1ZbfkPJ9kI\\n0P09stSMVbWzqmaranb9uplVLk5S31Zb/l3Add3wdcCj/cSRNCkrlj/J/cC/Ab+Z5K0k1wO3Apcn\\neQX4ve65pJPIiuf5q+raJSZ9qecskibIK/ykRll+qVGWX2qU5ZcaZfmlRll+qVGWX2qU5ZcaZfml\\nRll+qVGWX2qU5ZcaZfmlRll+qVGWX2qU5ZcaZfmlRll+qVGWX2qU5ZcaZfmlRll+qVGWX2qU5Zca\\nZfmlRll+qVHD/FbfPUmOJNm7YNwtSQ4m2d09rhxvTEl9G2bL/z3gikXG31FVW7vHY/3GkjRuK5a/\\nqp4G3p1AFkkTNMo+/w1J9nS7BWf2lkjSRKy2/HcC5wFbgUPAbUvNmGRHkrkkc0ePza9ycZL6tqry\\nV9Xhqpqvqg+Bu4Bty8y7s6pmq2p2/bqZ1eaU1LNVlT/JxgVPrwb2LjWvpI+nU1aaIcn9wBeAs5K8\\nBfw58IUkW4ECDgDfGGNGSWO
wYvmr6tpFRt89hiySJsgr/KRGWX6pUZZfapTllxpl+aVGWX6pUZZf\\napTllxpl+aVGWX6pUZZfapTllxpl+aVGWX6pUZZfapTllxpl+aVGWX6pUZZfapTllxpl+aVGWX6p\\nUZZfapTllxpl+aVGWX6pUSuWP8nmJE8leTnJviQ3duPXJnkiySvd3zPHH1dSX4bZ8n8AfKuqLgAu\\nAb6Z5ALgZuDJqjofeLJ7LukksWL5q+pQVb3QDb8P7Ac2AduBe7vZ7gWuGldISf37SPv8SbYAFwHP\\nABuq6lA36W1gQ6/JJI3V0OVPcjrwEHBTVb23cFpVFVBLvG5Hkrkkc0ePzY8UVlJ/hip/kjUMin9f\\nVT3cjT6cZGM3fSNwZLHXVtXOqpqtqtn162b6yCypB8Mc7Q9wN7C/qm5fMGkXcF03fB3waP/xJI3L\\nKUPMcynwdeClJLu7cd8GbgUeTHI98AbwtfFElDQOK5a/qn4EZInJX+o3jqRJ8Qo/qVGWX2qU5Zca\\nZfmlRll+qVGWX2rUMOf5P5F+/L8fLjntO+dePMEk0nS45ZcaZfmlRll+qVGWX2qU5ZcaZfmlRjV7\\nqu/S05b+3DvlnM9MMIlWb/fKs2hJbvmlRll+qVGWX2qU5ZcaZfmlRll+qVHNnupbzj//2P+FXJ98\\nbvmlRll+qVGWX2qU5ZcaZfmlRll+qVHD/Erv5iRPJXk5yb4kN3bjb0lyMMnu7nHl+ONK6ssw5/k/\\nAL5VVS8kOQN4PskT3bQ7quqvxhdP0rgM8yu9h4BD3fD7SfYDm8YdTNJ4faR9/iRbgIuAZ7pRNyTZ\\nk+SeJGcu8ZodSeaSzB09Nj9SWEn9Gbr8SU4HHgJuqqr3gDuB84CtDL4Z3LbY66pqZ1XNVtXs+nUz\\nPUSW1Iehyp9kDYPi31dVDwNU1eGqmq+qD4G7gG3jiympb8Mc7Q9wN7C/qm5fMH7jgtmuBvb2H0/S\\nuAxztP9S4OvAS0mO/4+J3wauTbIVKOAA8I2xJJQ0FsMc7f8RkEUmPdZ/HEmT4hV+UqMsv9Qoyy81\\nyvJLjbL8UqMsv9Qoyy81yvJLjbL8UqMsv9Qoyy81yvJLjbL8UqMsv9Qoyy81yvJLjbL8UqMsv9Qo\\nyy81yvJLjbL8UqMsv9Qoyy81yvJLjbL8UqMsv9SoYX6o87QkzyZ5Mcm+JH/RjT8nyTNJXk3yQJJT\\nxx9XUl+G2fL/DPhiVX0O2ApckeQS4LvAHVX1G8BPgOvHF1NS31Ysfw38T/d0Tfco4IvAP3Xj7wWu\\nGktCSWMx1D5/kpnu57mPAE8ArwE/raoPulneAjYt8dodSeaSzB09Nt9HZkk9GKr8VTVfVVuBs4Ft\\nwGeHXUBV7ayq2aqaXb9uZpUxJfXtIx3tr6qfAk8BvwN8Oskp3aSzgYM9Z5M0RsMc7V+f5NPd8K8A\\nlwP7GXwI/EE323XAo+MKKal/qarlZ0guZHBAb4bBh8WDVfWdJOcCPwDWAv8O/GFV/WyF9zoKvNE9\\nPQt4Z7T4vfo45THL4syyuIVZPlNV64d50YrlH5ckc1U1O5WFL+LjlMcsizPL4labxSv8pEZZfqlR\\n0yz/zikuezEfpzxmWZxZFreqLFPb55c0XX7tlxo1lfInuSLJf3Z3BN48jQwLshxI8lKS3UnmJrzs\\ne5IcSbJ3wbi1SZ5I8kr398wpZrklycFu3exOcuWEsmxO8lSSl7s7SW/sxk983SyTZeLrpvc7bKtq\\nog8G1wu8BpwLnAq8CFww6RwL8hwAzprSsj8PXAzsXTDuL4Gbu+Gbge9OMcstwJ9MYb1sBC7uhs8A\\n/gu4YBrrZpksE183QIDTu+E1wDPAJcCDwDXd+L8D/miY95vGln8b8GpVvV5VP2dwodD2KeSYuqp6\\nGnj3hNHbGVxUBRO8W3KJLFNRVYeq6oVu+H0GV5RuYgrrZpks
E1cDvd1hO43ybwLeXPB8yTsCJ6SA\\nHyZ5PsmOKeY4bkNVHeqG3wY2TDMMcEOSPd1uwUR2QRZKsgW4iMFWbqrr5oQsMIV1M8odtifygB9c\\nVlUXA78PfDPJ56cd6LgafI+b5umYO4HzGPwnLoeA2ya58CSnAw8BN1XVewunTXrdLJJlKuumRrjD\\n9kTTKP9BYPOC51O9I7CqDnZ/jwCPMFih03Q4yUaA7u+RaQWpqsPdP7YPgbuY4LpJsoZB2e6rqoe7\\n0VNZN4tlmea66ZY/8h220yj/c8D53RHKU4FrgF1TyEGSTyU54/gw8GVg7/KvGrtdDO6ShCnfLXm8\\naJ2rmdC6SRLgbmB/Vd2+YNLE181SWaaxbnq/w3aSRysXHLW8ksFR09eAP51Ghi7HuQzONrwI7Jt0\\nFuB+Bl8Zf8FgX+16YB3wJPAK8K/A2ilm+QfgJWAPg+JtnFCWyxh8pd8D7O4eV05j3SyTZeLrBriQ\\nwR20exh82PzZgn/HzwKvAv8I/PIw7+cVflKjPOAnNcryS42y/FKjLL/UKMsvNcryS42y/FKjLL/U\\nqP8DsIgi3QqylHQAAAAASUVORK5CYII=\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f6223d706a0>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    }\n   ],\n   \"source\": [\n    \"params={}\\n\",\n    \"params['nbclasses'] = 3\\n\",\n    \"params['imagesize'] = 31\\n\",\n    \"params['prestime'] = 3\\n\",\n    \"params['interpresdelay'] = 2\\n\",\n    \"params['prestimetest'] = 2\\n\",\n    \"params['nbshots'] = 1\\n\",\n    \"\\n\",\n    \"params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['interpresdelay']) * params['nbclasses']) + params['prestimetest'] \\n\",\n    \"inputT = np.zeros((params['nbsteps'], 1, params['imagesize'], params['imagesize']))    #inputTensor, initially in numpy format...\\n\",\n    \"labelT = np.zeros((params['nbsteps'], 1, params['nbclasses']))      #labelTensor, initially in numpy format...\\n\",\n    \"\\n\",\n    \"patterns=[]\\n\",\n    \"cats = np.random.permutation(np.arange(len(imagedata)))[:params['nbclasses']]  # Which categories to use for this episode?\\n\",\n    \"testcat = random.choice(cats)\\n\",\n    \"\\n\",\n    \"# Inserting the character images and labels in the input tensor at the proper places\\n\",\n    \"location = 0\\n\",\n    \"for nc in range(params['nbshots']):\\n\",\n    \"    np.random.shuffle(cats)\\n\",\n    \"    for ii, catnum in 
enumerate(cats):\\n\",\n    \"        p = random.choice(imagedata[catnum])\\n\",\n    \"        for nr in range(rots[catnum]):\\n\",\n    \"            p = np.rot90(p)\\n\",\n    \"        p = skimage.transform.resize(p, (31, 31))\\n\",\n    \"        for nn in range(params['prestime']):\\n\",\n    \"            #numi =nc * (params['nbpatterns'] * (params['prestime']+params['interpresdelay'])) + ii * (params['prestime']+params['interpresdelay']) + nn\\n\",\n    \"\\n\",\n    \"            inputT[location][0][:][:] = p[:][:]\\n\",\n    \"            #labelT[location][0][cats.index(\\n\",\n    \"            location += 1\\n\",\n    \"        location += params['interpresdelay']\\n\",\n    \"\\n\",\n    \"p = random.choice(imagedata[testcat])\\n\",\n    \"for nr in range(rots[testcat]):\\n\",\n    \"    p = np.rot90(p)\\n\",\n    \"p = skimage.transform.resize(p, (31, 31))\\n\",\n    \"for nn in range(params['prestimetest']):\\n\",\n    \"    inputT[location][0][:][:] = p[:][:]\\n\",\n    \"    location += 1\\n\",\n    \"\\n\",\n    \"print(location, params['nbsteps'], cats, rots[cats])\\n\",\n    \"\\n\",\n    \"plt.figure()\\n\",\n    \"for x in (1, 1+4, 1+9, 1+14):\\n\",\n    \"    plt.figure()\\n\",\n    \"    plt.imshow(inputT[x][0])\\n\",\n    \"plt.figure()\\n\",\n    \"plt.imshow(inputT[-1][0])\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 126,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"<matplotlib.image.AxesImage at 0x7f62245185c0>\"\n      ]\n     },\n     \"execution_count\": 126,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    },\n    {\n     \"data\": {\n      \"image/png\": 
\"iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAADkJJREFUeJzt3W2MXOV5h/HrzsZAGlDBxrFc49ZA\\n3CY0CoZu3ZAgQkABx40KNBUCtRGRnBgRQKCkUi2iNrT0A2kL9C3QmtqNFRESWkD4gym4iNZCRYaF\\nGOOXJrzECIzxGhME/VASL3c/zLG0tfbMDjNnZtZ+rp+0mjPnOS+3jv2fM3Oe88xEZiKpPO8bdgGS\\nhsPwS4Uy/FKhDL9UKMMvFcrwS4Uy/FKhDL9UKMMvFer9vawcEcuAvwFGgH/KzJvbLX/i7JFctHBW\\nL7uU1Maul3/O629MRCfLdh3+iBgBvg18FngFeDIi1mfmjrp1Fi2cxRMPLex2l5KmsfTClztetpe3\\n/UuB5zPzxcz8GfB94KIetidpgHoJ/wJg8svMK9U8SYeBvl/wi4iVETEWEWP79k/0e3eSOtRL+HcD\\nkz/An1TN+38yc3Vmjmbm6Nw5Iz3sTlKTegn/k8DiiDg5Io4CLgPWN1OWpH7r+mp/Zh6IiGuAh2h1\\n9a3NzO2NVSapr3rq58/MDcCGhmqRNEDe4ScVyvBLhTL8UqEMv1Qowy8VyvBLhTL8UqEMv1Qowy8V\\nyvBLhTL8UqEMv1Qowy8VyvBLhTL8UqEMv1Qowy8VyvBLhTL8UqEMv1Qowy8VyvBLhTL8UqEMv1Qo\\nwy8VyvBLherp57oiYhfwNjABHMjM0SaKktR/PYW/8pnMfL2B7UgaIN/2S4XqNfwJPBwRT0XEyiYK\\nkjQYvb7tPzszd0fEh4CNEfHfmblp8gLVi8JKgF9e0MSnDElN6OnMn5m7q8dx4H5g6RTLrM7M0cwc\\nnTtnpJfdSWpQ1+GPiA9GxHEHp4ELgG1NFSapv3p5Hz4PuD8iDm7ne5n5b41UNQAX/tKSrtZ76NUt\\nDVciDUfX4c/MF4HTG6xF0gDZ1ScVyvBLhTL8UqEMv1Qowy8VylvudNha/qPltW0Tn3m1q22OHP+L\\ntW0bdvxnV9ucqTzzS4Uy/FKhDL9UKMMvFcrwS4Uy/FKh7OobkHO//JXatqM3PNn4/t753G/Wtv3H\\nmjsb31+/LD/t07VtE2/Wd+cdOP83atv+ds3f17Z9bdFZtW3tRoIejqM9PfNLhTL8UqEMv1Qowy8V\\nyvBLhTL8UqGK7epb8eOf1Lat+dWTa9u6/eLPo6nvzut2JNnyT/9u/f4ebL77EOCyn5xX2/bqX364\\ntm3T7atr23798d+vbTvpze21bftX1HfLjd10R20bfKBNWzk880uFMvxSoQy/VCjDLxXK8EuFMvxS\\noSIz2y8QsRb4PDCemR+r5s0GfgAsAnYBl2bmT6fb2ejpx+QTDy3sseThWn7GBY1vc8MPH258m912\\nSf71rv9q2379ok92td12o95m0u8mnnLvlbVti6/dPNBaurH0wpcZe+Z/o5NlOznzfwdYdsi8VcAj\\nmbkYeKR6LukwMm34M3MT8MYhsy8C1lXT64CLG65LUp91+5l/XmbuqaZfo/Vz3ZIOIz1f8MvWRYPa\\nCwcRsTIixiJibN/+iV53J6kh3YZ/b0TMB6gex+sWzMzVmTmamaNz54x0uTtJTes2/OuBK6rpK4AH\\nmilH0qBMO6ovIu4GzgVOjIhXgG8CNwP3RMQK4CXg0n4WOZP0o1uuH8a/Wt8l96Hb67vzuu3Km86v\\n/fNVtW2LeLy2rX3X4y/0UNHUXvzCP9a2XXhtd12SM9W04c/My2uazm+4FkkD5B1+UqEMv1Qowy8V\\nyvBLhTL8UqGK/QLPI1277rx
2fmfH/rbtVx//cm1bu9F5i75R353XzkePar47Ty2e+aVCGX6pUIZf\\nKpThlwpl+KVCGX6pUHb1Hcb+/PWPdLXe+07/aG3b1cff3W05M+ZLLNUZz/xSoQy/VCjDLxXK8EuF\\nMvxSoQy/VCi7+ma487705dq2WQ+PdbXNBx/svjtPRw7P/FKhDL9UKMMvFcrwS4Uy/FKhDL9UqE5+\\nq28t8HlgPDM/Vs27EfgKsK9a7IbM3NCvIget3RdRDtosuuvOc4SdptPJmf87wLIp5t+WmUuqvyMm\\n+FIppg1/Zm4C3hhALZIGqJfP/NdExNaIWBsRJzRWkaSB6Db8dwCnAkuAPcAtdQtGxMqIGIuIsX37\\nJ7rcnaSmdRX+zNybmROZ+S5wJ7C0zbKrM3M0M0fnzhnptk5JDesq/BExf9LTS4BtzZQjaVA66eq7\\nGzgXODEiXgG+CZwbEUuABHYBV/axRkl9MG34M/PyKWav6UMtA9VtX/74Vz9Z23agD78p+ezXbm9+\\noxLe4ScVy/BLhTL8UqEMv1Qowy8VyvBLhfLbe6fw3N/9Vm3bi1+w6+1Idsq99besLGbzACvpP8/8\\nUqEMv1Qowy8VyvBLhTL8UqEMv1Qou/o0o7Ubfbl/xVm1bWM33dHV/hZfe2R157XjmV8qlOGXCmX4\\npUIZfqlQhl8qlOGXCmVXn4rT7Ze33v7SY21aj+2umCHyzC8VyvBLhTL8UqEMv1Qowy8VatrwR8TC\\niHg0InZExPaIuK6aPzsiNkbEc9XjCf0vV1JTOunqOwB8PTOfjojjgKciYiPwJeCRzLw5IlYBq4A/\\n6l+pg9NuZNem365f75xj+lCMas1Z83ht24VruuvO23VT/UjBU2dt6WqbM9W0Z/7M3JOZT1fTbwM7\\ngQXARcC6arF1wMX9KlJS897TZ/6IWAScAWwG5mXmnqrpNWBeo5VJ6quOwx8RxwL3Atdn5luT2zIz\\ngaxZb2VEjEXE2L79Ez0VK6k5HYU/ImbRCv5dmXlfNXtvRMyv2ucD41Otm5mrM3M0M0fnzhlpomZJ\\nDejkan8Aa4CdmXnrpKb1wBXV9BXAA82XJ6lfOrna/yngi8CzEXHwcucNwM3APRGxAngJuLQ/JUrq\\nh2nDn5mPAVHTfH6z5QzOQ68eWd02R6p2/06jf3xVV9ts/+We5fy/8A4/qVCGXyqU4ZcKZfilQhl+\\nqVCGXyqUX+Cpw1a3v8enFs/8UqEMv1Qowy8VyvBLhTL8UqEMv1Qowy8VyvBLhTL8UqEMv1Qowy8V\\nyvBLhTL8UqEMv1Qowy8VyvBLhTL8UqEMv1Qowy8VyvBLherkV3oXRsSjEbEjIrZHxHXV/BsjYndE\\nbKn+lve/XElN6eTbew8AX8/MpyPiOOCpiNhYtd2WmX/Vv/Ik9Usnv9K7B9hTTb8dETuBBf0uTFJ/\\nvafP/BGxCDgD2FzNuiYitkbE2og4oWadlRExFhFj+/ZP9FSspOZ0HP6IOBa4F7g+M98C7gBOBZbQ\\nemdwy1TrZebqzBzNzNG5c0YaKFlSEzoKf0TMohX8uzLzPoDM3JuZE5n5LnAnsLR/ZUpqWidX+wNY\\nA+zMzFsnzZ8/abFLgG3NlyepXzq52v8p4IvAsxGxpZp3A3B5RCwBEtgFXNmXCiX1RSdX+x8DYoqm\\nDc2XI2lQvMNPKpThlwpl+KVCGX6pUIZfKpThlwpl+KVCGX6pUIZfKpThlwpl+KVCGX6pUIZfKpTh\\nlwpl+KVCGX6pUIZfKpThlwpl+KVCGX6pUIZfKpThlwpl+KVCGX6pUIZfKpThlwrVyQ91HhMRT0TE\\nMxGxPSL+tJp/ckRsjojnI+IHEXFU/8uV1JROzvzvAOdl5unAEmBZRHwC+BZwW2Z+GPgpsKJ/ZUpq\\n2rThz5b/qZ7Oqv4SOA/412r+OuDivlQoqS86+swfESPVz3OP
AxuBF4A3M/NAtcgrwIKadVdGxFhE\\njO3bP9FEzZIa0FH4M3MiM5cAJwFLgY90uoPMXJ2Zo5k5OnfOSJdlSmrae7ran5lvAo8CZwHHR8T7\\nq6aTgN0N1yapjzq52j83Io6vpj8AfBbYSetF4Peqxa4AHuhXkZKaF5nZfoGIj9O6oDdC68Xinsz8\\ns4g4Bfg+MBv4IfAHmfnONNvaB7xUPT0ReL238hs1k+qxlqlZy9Qm1/IrmTm3k5WmDX+/RMRYZo4O\\nZedTmEn1WMvUrGVq3dbiHX5SoQy/VKhhhn/1EPc9lZlUj7VMzVqm1lUtQ/vML2m4fNsvFWoo4Y+I\\nZRHxo2pE4Kph1DCpll0R8WxEbImIsQHve21EjEfEtknzZkfExoh4rno8YYi13BgRu6tjsyUilg+o\\nloUR8WhE7KhGkl5XzR/4sWlTy8CPTeMjbDNzoH+07hd4ATgFOAp4Bjht0HVMqmcXcOKQ9n0OcCaw\\nbdK8vwBWVdOrgG8NsZYbgT8cwnGZD5xZTR8H/Bg4bRjHpk0tAz82QADHVtOzgM3AJ4B7gMuq+f8A\\nXNXJ9oZx5l8KPJ+ZL2bmz2jdKHTREOoYuszcBLxxyOyLaN1UBQMcLVlTy1Bk5p7MfLqafpvWHaUL\\nGMKxaVPLwGVLYyNshxH+BcDLk57XjggckAQejoinImLlEOs4aF5m7qmmXwPmDbMY4JqI2Fp9LBjI\\nR5DJImIRcAats9xQj80htcAQjk0vI2wP5QU/ODszzwQ+B1wdEecMu6CDsvU+bpjdMXcAp9L6Epc9\\nwC2D3HlEHAvcC1yfmW9Nbhv0sZmilqEcm+xhhO2hhhH+3cDCSc+HOiIwM3dXj+PA/bQO6DDtjYj5\\nANXj+LAKycy91X+2d4E7GeCxiYhZtMJ2V2beV80eyrGZqpZhHptq/z2PsB1G+J8EFldXKI8CLgPW\\nD6EOIuKDEXHcwWngAmBb+7X6bj2tUZIw5NGSB4NWuYQBHZuICGANsDMzb53UNPBjU1fLMI5N4yNs\\nB3m1ctJVy+W0rpq+AHxjGDVUdZxCq7fhGWD7oGsB7qb1lvHntD6rrQDmAI8AzwH/DsweYi3fBZ4F\\nttIK3vwB1XI2rbf0W4Et1d/yYRybNrUM/NgAH6c1gnYrrRebP5n0//gJ4HngX4CjO9med/hJhfKC\\nn1Qowy8VyvBLhTL8UqEMv1Qowy8VyvBLhTL8UqH+D3ePpxaoEdRyAAAAAElFTkSuQmCC\\n\",\n      \"text/plain\": [\n       \"<matplotlib.figure.Figure at 0x7f6224614080>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    }\n   ],\n   \"source\": [\n    \"z = np.zeros((1, 31, 31))\\n\",\n    \"z[0][:][:] = patterns[1][:][:]\\n\",\n    \"plt.imshow(z[0])\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 140,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"ename\": \"AttributeError\",\n     \"evalue\": \"module 'numpy' has no attribute 'find'\",\n     \"output_type\": \"error\",\n     \"traceback\": [\n      \"\\u001b[0;31m---------------------------------------------------------------------------\\u001b[0m\",\n      
\"\\u001b[0;31mAttributeError\\u001b[0m                            Traceback (most recent call last)\",\n      \"\\u001b[0;32m<ipython-input-140-f78f50f5f959>\\u001b[0m in \\u001b[0;36m<module>\\u001b[0;34m()\\u001b[0m\\n\\u001b[1;32m      1\\u001b[0m \\u001b[0mcats\\u001b[0m\\u001b[0;34m\\u001b[0m\\u001b[0m\\n\\u001b[0;32m----> 2\\u001b[0;31m \\u001b[0mnp\\u001b[0m\\u001b[0;34m.\\u001b[0m\\u001b[0mfind\\u001b[0m\\u001b[0;34m(\\u001b[0m\\u001b[0mcats\\u001b[0m\\u001b[0;34m==\\u001b[0m\\u001b[0;36m414\\u001b[0m\\u001b[0;34m)\\u001b[0m\\u001b[0;34m\\u001b[0m\\u001b[0m\\n\\u001b[0m\",\n      \"\\u001b[0;31mAttributeError\\u001b[0m: module 'numpy' has no attribute 'find'\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"cats\\n\",\n    \"np.find(cats==414)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": []\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \"Python 3\",\n   \"language\": \"python\",\n   \"name\": \"python3\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": {\n    \"name\": \"ipython\",\n    \"version\": 3\n   },\n   \"file_extension\": \".py\",\n   \"mimetype\": \"text/x-python\",\n   \"name\": \"python\",\n   \"nbconvert_exporter\": \"python\",\n   \"pygments_lexer\": \"ipython3\",\n   \"version\": \"3.5.2\"\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 2\n}\n"
  },
  {
    "path": "omniglot/README.md",
    "content": "# Omniglot experiment\n\nThis code performs the Omniglot task (fast learning of image-label mappings).\n\nTo run this code, you must download [the Python version of the Omniglot dataset](https://github.com/brendenlake/omniglot), and move the `omniglot-master` directory inside this directory. You will also need the scikit-image library (in addition to PyTorch).\n\nTo reproduce the results shown in the paper:\n\n```\npython3 omniglot.py --nbclasses 5  --nbiter 5000000 --rule oja --activ tanh --steplr 1000000 --prestime 1 --prestimetest 1 --gamma .666 --alpha free --lr 3e-5 \n\n```\n"
  },
  {
    "path": "omniglot/omniglot.py",
    "content": "# Differentiable plasticity: Omniglot task.\n\n# Copyright (c) 2018 Uber Technologies, Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#    http://www.apache.org/licenses/LICENSE-2.0\n#\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n\n\n# You MUST download the Python version of the Omniglot dataset\n# (https://github.com/brendenlake/omniglot), and move the 'omniglot-master'\n# directory inside this directory.\n\n# To get the results shown in the paper:\n# python3 omniglot.py --nbclasses 5  --nbiter 5000000 --rule oja --activ tanh --steplr 1000000 --prestime 1 --gamma .666 --alpha free --lr 3e-5 \n\n# Alternative (using a shared, though still learned alpha across all connections): \n# python3 omniglot.py --nbclasses 5  --nbiter 5000000 --activ tanh --steplr 1000000 --prestime 1 --gamma 0.3 --lr 1e-4 --alpha yoked \n\n# Note that this code uses click rather than argparse for command-line\n# parameter handling. 
I won't do that again.\n\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport click\nimport numpy as np\nimport torch.nn.functional as F\nfrom torch import optim\nfrom torch.optim import lr_scheduler\nimport random\nimport sys\nimport pickle\nimport pdb\nimport time\nimport skimage\nfrom skimage import transform\nfrom skimage import io\nimport os\nimport platform\nimport glob\n\n\nnp.set_printoptions(precision=4)\ndefaultParams = {\n    'activ': 'tanh',    # 'tanh' or 'selu'\n    #'plastsize': 200,\n    'rule': 'hebb',     # 'hebb' or 'oja'\n    'alpha': 'free',   # 'free' or 'yoked' (if the latter, alpha is a single scalar learned parameter, shared across all connections)\n    'steplr': 1e6,  # How often should we change the learning rate?\n    'nbclasses': 5,\n    'gamma': .666,  # The annealing factor of learning rate decay for Adam\n    'flare': 0,     # Whether or not the ConvNet has more features in higher channels\n    'nbshots': 1,  # Number of 'shots' in the few-shot learning task\n    'prestime': 1,\n    'nbf' : 64,  # Number of features. 
128 is better (unsurprisingly) but we keep 64 for fair comparison with other reports\n    'prestimetest': 1,\n    'ipd': 0,  # Inter-presentation delay \n    'imgsize': 31,    \n    'nbiter': 5000000,  \n    'lr': 3e-5, \n    'test_every': 500,\n    'save_every': 10000,\n    'rngseed':0\n}\nNBTESTCLASSES = 100\n\n\n\n\n#ttype = torch.FloatTensor;\nttype = torch.cuda.FloatTensor;\n\n\n# Generate the full list of inputs, labels, and the target label for an episode\ndef generateInputsLabelsAndTarget(params, imagedata, test=False):\n    #print((\"Input Boost:\", params['inputboost']))\n    #params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['ipd']) * params['nbclasses']) + params['prestimetest'] \n    inputT = np.zeros((params['nbsteps'], 1, 1, params['imgsize'], params['imgsize']))    #inputTensor, initially in numpy format... Note dimensions: number of steps x batchsize (always 1) x NbChannels (also 1) x h x w \n    labelT = np.zeros((params['nbsteps'], 1, params['nbclasses']))      #labelTensor, initially in numpy format...\n\n    patterns=[]\n    if test:\n        cats = np.random.permutation(np.arange(len(imagedata) - NBTESTCLASSES, len(imagedata)))[:params['nbclasses']]  # Which categories to use for this *testing* episode?\n    else:\n        cats = np.random.permutation(np.arange(len(imagedata) - NBTESTCLASSES))[:params['nbclasses']]  # Which categories to use for this *training* episode?\n    #print(\"Test is\", test, \", cats are\", cats)\n    #cats = np.array(range(params['nbclasses'])) + 10\n\n    cats = np.random.permutation(cats)\n    #print(cats)\n\n    # We show one picture of each category, with labels, then one picture of one of these categories as a test, without label\n    # But each of the categories may undergo rotation by 0, 90, 180 or 270deg, for augmenting the dataset\n    # NOTE: We randomly assign one rotation to all the possible categories, not just the ones selected for the episode - it makes the coding simpler\n    rots 
= np.random.randint(4, size=len(imagedata))\n\n    #rots.fill(0)\n\n    testcat = random.choice(cats) # select the class on which we'll test in this episode\n    unpermcats = cats.copy()      \n\n    # Inserting the character images and labels in the input tensor at the proper places\n    location = 0\n    for nc in range(params['nbshots']):\n        np.random.shuffle(cats)   # Presentations occur in random order\n        for ii, catnum in enumerate(cats):\n            #print(catnum)\n            p = random.choice(imagedata[catnum])\n            for nr in range(rots[catnum]):\n                p = np.rot90(p)\n            p = skimage.transform.resize(p, (31, 31))\n            for nn in range(params['prestime']):\n                #numi =nc * (params['nbclasses'] * (params['prestime']+params['ipd'])) + ii * (params['prestime']+params['ipd']) + nn\n\n                inputT[location][0][0][:][:] = p[:][:]\n                labelT[location][0][np.where(unpermcats == catnum)] = 1 # The (one-hot) label is the position of the category number in the original (unpermuted) list\n                #if nn == 0:\n                #    print(labelT[location][0])\n                location += 1\n            location += params['ipd']\n\n    # Inserting the test character\n    p = random.choice(imagedata[testcat])\n    for nr in range(rots[testcat]):\n        p = np.rot90(p)\n    p = skimage.transform.resize(p, (31, 31))\n    for nn in range(params['prestimetest']):\n        inputT[location][0][0][:][:] = p[:][:]\n        location += 1\n        \n    # Generating the test label\n    testlabel = np.zeros(params['nbclasses'])\n    testlabel[np.where(unpermcats == testcat)] = 1\n    #print(testcat, testlabel)\n\n    #pdb.set_trace()\n        \n    \n    assert(location == params['nbsteps'])\n\n    inputT = torch.from_numpy(inputT).type(ttype)  # Convert from numpy to pytorch Tensor\n    labelT = torch.from_numpy(labelT).type(ttype)\n    targetL = torch.from_numpy(testlabel).type(ttype)\n\n   
 return inputT, labelT, targetL\n\n\n\nclass Network(nn.Module):\n    def __init__(self, params):\n        super(Network, self).__init__()\n        self.rule = params['rule']\n        if params['flare'] == 1:\n            self.cv1 = torch.nn.Conv2d(1, params['nbf'] //4 , 3, stride=2).cuda()\n            self.cv2 = torch.nn.Conv2d(params['nbf'] //4 , params['nbf'] //4 , 3, stride=2).cuda()\n            self.cv3 = torch.nn.Conv2d(params['nbf'] //4, params['nbf'] //2, 3, stride=2).cuda()\n            self.cv4 = torch.nn.Conv2d(params['nbf'] //2,  params['nbf'], 3, stride=2).cuda()\n        else:\n            self.cv1 = torch.nn.Conv2d(1, params['nbf'] , 3, stride=2).cuda()\n            self.cv2 = torch.nn.Conv2d(params['nbf'] , params['nbf'] , 3, stride=2).cuda()\n            self.cv3 = torch.nn.Conv2d(params['nbf'] , params['nbf'] , 3, stride=2).cuda()\n            self.cv4 = torch.nn.Conv2d(params['nbf'] ,  params['nbf'], 3, stride=2).cuda()\n        \n        # Alternative architecture: have a separate layer of\n        # plastic weights between the embedding and the output. 
We don't use\n        # this in the paper.\n        #self.conv2plast = torch.nn.Linear(params['nbf'], params['plastsize']).cuda()\n\n        # Notice that the vectors are row vectors, and the matrices are transposed wrt the usual order, following apparent pytorch conventions\n        # Each *column* of w targets a single output neuron\n        \n        self.w =  torch.nn.Parameter((.01 * torch.randn(params['nbf'], params['nbclasses'])).cuda(), requires_grad=True)\n        #self.w =  torch.nn.Parameter((.01 * torch.rand(params['plastsize'], params['nbclasses'])).cuda(), requires_grad=True)\n        if params['alpha'] == 'free':\n            self.alpha =  torch.nn.Parameter((.01 * torch.rand(params['nbf'], params['nbclasses'])).cuda(), requires_grad=True) # Note: rand rather than randn (all positive)\n        elif params['alpha'] == 'yoked':\n            self.alpha =  torch.nn.Parameter((.01 * torch.ones(1)).cuda(), requires_grad=True)\n        else :\n            raise ValueError(\"Must select a value for alpha ('free' or 'yoked')\")\n        self.eta = torch.nn.Parameter((.01 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same eta\n        self.params = params\n\n    def forward(self, inputx, inputlabel, hebb):\n        if self.params['activ'] == 'selu':\n            activ = F.selu(self.cv1(inputx))\n            activ = F.selu(self.cv2(activ))\n            activ = F.selu(self.cv3(activ))\n            activ = F.selu(self.cv4(activ))\n        elif self.params['activ'] == 'relu':\n            activ = F.relu(self.cv1(inputx))\n            activ = F.relu(self.cv2(activ))\n            activ = F.relu(self.cv3(activ))\n            activ = F.relu(self.cv4(activ))\n        elif self.params['activ'] == 'tanh':\n            activ = F.tanh(self.cv1(inputx))\n            activ = F.tanh(self.cv2(activ))\n            activ = F.tanh(self.cv3(activ))\n            activ = F.tanh(self.cv4(activ))\n        else:\n            raise ValueError(\"Parameter 'activ' is 
incorrect (must be tanh, relu or selu)\")\n        #activ = F.tanh(self.conv2plast(activ.view(1, self.params['nbf'])))\n        #activin = activ.view(-1, self.params['plastsize'])\n        activin = activ.view(-1, self.params['nbf'])\n        \n        if self.params['alpha'] == 'free':\n            activ = activin.mm( self.w + torch.mul(self.alpha, hebb)) + 1000.0 * inputlabel # The expectation is that a nonzero inputlabel will overwhelm the inputs and clamp the outputs\n        elif self.params['alpha'] == 'yoked':\n            activ = activin.mm( self.w + self.alpha * hebb) + 1000.0 * inputlabel # The expectation is that a nonzero inputlabel will overwhelm the inputs and clamp the outputs\n        activout = F.softmax( activ )\n        \n        if self.rule == 'hebb':\n            hebb = (1 - self.eta) * hebb + self.eta * torch.bmm(activin.unsqueeze(2), activout.unsqueeze(1))[0] # bmm used to implement outer product; remember activs have a leading singleton dimension\n        elif self.rule == 'oja':\n            hebb = hebb + self.eta * torch.mul((activin[0].unsqueeze(1) - torch.mul(hebb , activout[0].unsqueeze(0))) , activout[0].unsqueeze(0))  # Oja's rule. Remember that yin, yout are row vectors (dim (1,N)). 
Also, broadcasting!\n        else:\n            raise ValueError(\"Must select one learning rule ('hebb' or 'oja')\")\n\n        return activout, hebb\n\n    def initialZeroHebb(self):\n        #return Variable(torch.zeros(self.params['plastsize'], self.params['nbclasses']).type(ttype))\n        return Variable(torch.zeros(self.params['nbf'], self.params['nbclasses']).type(ttype))\n\n\n\n\ndef train(paramdict=None):\n    #params = dict(click.get_current_context().params)\n    print(\"Starting training...\")\n    params = {}\n    params.update(defaultParams)\n    if paramdict:\n        params.update(paramdict)\n    print(\"Passed params: \", params)\n    print(platform.uname())\n    sys.stdout.flush()\n    params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['ipd']) * params['nbclasses']) + params['prestimetest']  # Total number of steps per episode\n    suffix = \"W\"+\"\".join([str(x)+\"_\" if pair[0] != 'nbsteps' and pair[0] != 'rngseed' and pair[0] != 'save_every' and pair[0] != 'test_every' else '' for pair in sorted(zip(params.keys(), params.values()), key=lambda x:x[0] ) for x in pair])[:-1] + \"_rngseed_\" + str(params['rngseed'])   # Turning the parameters into a nice suffix for filenames\n    print(\"Suffix: \", suffix, \"length:\", len(suffix))\n    # Initialize random seeds (first two redundant?)\n    print(\"Setting random seeds\")\n    np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n    #print(click.get_current_context().params)\n\n\n    print(\"Loading Omniglot data...\")\n    imagedata = []\n    imagefilenames=[]\n    for basedir in ('./omniglot-master/python/images_background/', \n                    './omniglot-master/python/images_evaluation/'):\n        alphabetdirs = glob.glob(basedir+'*')\n        print(alphabetdirs[:4])\n        for alphabetdir in alphabetdirs:\n            chardirs = glob.glob(alphabetdir+\"/*\")\n            for chardir in chardirs:\n                chardata = []\n                charfiles = glob.glob(chardir+'/*')\n                for fn in charfiles:\n                    filedata = skimage.io.imread(fn) / 255.0 #plt.imread(fn)\n                    chardata.append(filedata)\n                imagedata.append(chardata)\n                imagefilenames.append(fn)\n    # imagedata is now a list of lists of numpy arrays \n    # imagedata[CharacterNumber][FileNumber] -> numpy(105,105)\n    np.random.shuffle(imagedata)  # Randomize order of characters \n    print(len(imagedata))\n    print(imagedata[1][2].shape)\n    print(\"Data loaded!\")\n\n\n\n    print(\"Initializing network\")\n    net = Network(params)\n    #net.cuda()\n    print (\"Shape of all optimized parameters:\", [x.size() for x in net.parameters()])\n    allsizes = [torch.numel(x.data.cpu()) for x in net.parameters()]\n    print (\"Size (numel) of all optimized elements:\", allsizes)\n    print (\"Total size (numel) of all optimized elements:\", sum(allsizes))\n\n    #total_loss = 0.0\n    print(\"Initializing optimizer\")\n    #optimizer = torch.optim.Adam([net.w, net.alpha, net.eta], lr=params['lr'])\n    optimizer = torch.optim.Adam(net.parameters(), lr=1.0*params['lr'])\n    #scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, params['gamma']) \n    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, gamma=params['gamma'], step_size=params['steplr'])\n\n\n\n    all_losses = []\n    all_losses_objective = []\n    lossbetweensaves = 0.0\n    lossbetweensavesprev = 1e+10\n    #test_every = 20\n    nowtime = time.time()\n    \n    print(\"Starting episodes...\")\n    sys.stdout.flush()\n    \n    for numiter in range(params['nbiter']):\n        \n        hebb = net.initialZeroHebb()\n        optimizer.zero_grad()\n\n        is_test_step = ((numiter+1) % params['test_every'] == 0)\n        inputs, labels, target = generateInputsLabelsAndTarget(params, imagedata, test=is_test_step)\n\n\n        for numstep in 
range(params['nbsteps']):\n            y, hebb = net(Variable(inputs[numstep], requires_grad=False), Variable(labels[numstep], requires_grad=False), hebb)\n\n        # Compute the loss\n        criterion = torch.nn.BCELoss()\n        loss = criterion(y[0], Variable(target, requires_grad=False))\n\n        # Compute the gradients\n        if is_test_step == False:\n            loss.backward()\n            \n            maxg = 0.0\n            scheduler.step()\n            optimizer.step()\n\n        lossnum = loss.data[0]\n        lossbetweensaves += lossnum\n        all_losses_objective.append(lossnum)\n        #total_loss  += lossnum\n\n        if is_test_step: # (numiter+1) % params['test_every'] == 0:\n\n            print(numiter, \"====\")\n            td = target.cpu().numpy()\n            yd = y.data.cpu().numpy()[0]\n            print(\"y: \", yd[:10])\n            print(\"target: \", td[:10])\n            #print(\"target: \", target.unsqueeze(0)[0][:10])\n            absdiff = np.abs(td-yd)\n            print(\"Mean / median / max abs diff:\", np.mean(absdiff), np.median(absdiff), np.max(absdiff))\n            print(\"Correlation (full / sign): \", np.corrcoef(td, yd)[0][1], np.corrcoef(np.sign(td), np.sign(yd))[0][1])\n            #print inputs[numstep]\n            previoustime = nowtime\n            nowtime = time.time()\n            print(\"Time spent on last\", params['test_every'], \"iters: \", nowtime - previoustime)\n            #total_loss /= params['test_every']\n            #print(\"Mean loss over last\", params['test_every'], \"iters:\", total_loss)\n            #all_losses.append(total_loss)\n            print(\"Loss on single withheld-data episode:\", lossnum)\n            all_losses.append(lossnum)\n            print (\"Eta: \", net.eta.data.cpu().numpy())\n            sys.stdout.flush()\n            #total_loss = 0\n\n\n        if (numiter+1) % params['save_every'] == 0:\n            print(\"Saving files...\")\n            lossbetweensaves 
/= params['save_every']\n            print(\"Average loss over the last\", params['save_every'], \"episodes:\", lossbetweensaves)\n            print(\"Alternative computation (should be equal):\", np.mean(all_losses_objective[-params['save_every']:]))\n            losslast100 = np.mean(all_losses_objective[-100:])\n            print(\"Average loss over the last 100 episodes:\", losslast100)\n            # Instability detection; useful for SELUs, which seem to be divergence-prone\n            # NOTE: highly experimental!\n            # Note that if we are unlucky enough to have diverged within the last 100 timesteps, this may not save us.\n            #if losslast100 > 2 * lossbetweensavesprev: \n            #    print(\"We have diverged ! Restoring last savepoint!\")\n            #    net.load_state_dict(torch.load('./torchmodel_'+suffix + '.txt'))\n            #else: # to \"print(\"Saved!\")\"\n            print(\"Saving local files...\")\n            localsuffix = suffix\n            if (numiter + 1) % 500000 == 0:\n                localsuffix = localsuffix + \"_\"+str(numiter+1)\n            with open('results_'+localsuffix+'.dat', 'wb') as fo:\n                pickle.dump(net.w.data.cpu().numpy(), fo)\n                pickle.dump(net.alpha.data.cpu().numpy(), fo)\n                pickle.dump(net.eta.data.cpu().numpy(), fo)\n                pickle.dump(all_losses, fo)\n                pickle.dump(params, fo)\n            with open('loss_'+localsuffix+'.txt', 'w') as thefile:\n                for item in all_losses:\n                    thefile.write(\"%s\\n\" % item)\n            torch.save(net.state_dict(), 'torchmodel_'+localsuffix+'.txt')\n            # # Uber-only \n            if os.path.isdir('/mnt/share/tmiconi'):\n                print(\"Transferring to NFS storage...\")\n                for fn in ['results_'+localsuffix+'.dat', 'loss_'+localsuffix+'.txt', 'torchmodel_'+localsuffix+'.txt']:\n                    result = os.system(\n                       
 'cp {} {}'.format(fn, '/mnt/share/tmiconi/omniglot-nfs/'+fn))\n                print(\"Done!\")\n            lossbetweensavesprev = lossbetweensaves\n            lossbetweensaves = 0\n            sys.stdout.flush()\n            sys.stderr.flush()\n\n\n\n@click.command()\n@click.option('--nbclasses', default=defaultParams['nbclasses'])\n@click.option('--alpha', default=defaultParams['alpha'])\n#@click.option('--plastsize', default=defaultParams['plastsize'])\n@click.option('--rule', default=defaultParams['rule'])\n@click.option('--gamma', default=defaultParams['gamma'])\n@click.option('--steplr', default=defaultParams['steplr'])\n@click.option('--activ', default=defaultParams['activ'])\n@click.option('--flare', default=defaultParams['flare'])\n@click.option('--nbshots', default=defaultParams['nbshots'])\n@click.option('--nbf', default=defaultParams['nbf'])\n@click.option('--prestime', default=defaultParams['prestime'])\n@click.option('--prestimetest', default=defaultParams['prestimetest'])\n@click.option('--ipd', default=defaultParams['ipd'])\n@click.option('--nbiter', default=defaultParams['nbiter'])\n@click.option('--lr', default=defaultParams['lr'])\n@click.option('--test_every', default=defaultParams['test_every'])\n@click.option('--save_every', default=defaultParams['save_every'])\n@click.option('--rngseed', default=defaultParams['rngseed'])\ndef main(nbclasses, alpha, rule, gamma, steplr, activ, flare, nbshots, nbf, prestime, prestimetest, ipd, nbiter, lr, test_every, save_every, rngseed):\n    train(paramdict=dict(click.get_current_context().params))\n    #print(dict(click.get_current_context().params))\n\nif __name__ == \"__main__\":\n    #train()\n    main()\n\n"
  },
  {
    "path": "omniglot/opus.docker",
    "content": "#tmiconi_omniglot\n#latest\n#.\n\n\nFROM localhost:5000/opus-deep-learning:master-test-2017_9_7_20_56_10\n\nRUN pip3 install scikit-image\nRUN pip3 install click\nRUN mkdir /home/work\nRUN mkdir /home/work/omniglot-master/\n\nCOPY ./*.py /home/work/\nADD ./omniglot-master /home/work/omniglot-master/\n\nENV LC_ALL C.UTF-8\nENV LANG C.UTF-8\n"
  },
  {
    "path": "omniglot/plotresults.py",
    "content": "import numpy as np\nimport glob\nimport matplotlib.pyplot as plt\n\ngroupnames = glob.glob('./tmp/loss*rngseed_0.txt')\n#fnames = glob.glob('./tmp/loss_api_*.txt')\n#fnames = glob.glob('./tmp/loss_fixed_*.txt')\n\ndef mavg(x, N):\n  cumsum = np.cumsum(np.insert(x, 0, 0)) \n  return (cumsum[N:] - cumsum[:-N]) / N\n\nplt.ion()\n#plt.figure(figsize=(5,4))  # Smaller figure = relative larger fonts\nplt.figure()\n\nmaxl = 100\n\nfor numgroup, groupname in enumerate(groupnames):\n    g = groupname[:-6]+\"*\"\n    print(g)\n    fnames = glob.glob(g)\n    fulllosses=[]\n    losses=[]\n    lgts=[]\n    for fn in fnames:\n        if \"COPY\" in fn:\n            continue\n        if \"00.tx\" in fn:\n            continue\n        z = np.loadtxt(fn)\n        z = z[::10] # Decimation\n        #z = mavg(z, 100)\n        lgts.append(len(z))\n        fulllosses.append(z)\n    minlen = min(lgts)\n    \n    for z in fulllosses:\n        losses.append(z[:minlen])\n\n    losses = np.array(losses)\n    \n    meanl = np.mean(losses, axis=0)\n    stdl = np.std(losses, axis=0)\n\n    medianl = np.median(losses, axis=0)\n    q1l = np.percentile(losses, 25, axis=0)\n    q3l = np.percentile(losses, 75, axis=0)\n    \n    highl = np.max(losses, axis=0)\n    lowl = np.min(losses, axis=0)\n    #highl = meanl+stdl\n    #lowl = meanl-stdl\n\n    myls = '-'\n    if numgroup >= 8:\n        myls = '--'\n    xx = range(len(meanl))\n\n    # xticks and labels\n    if len(meanl) > maxl:\n        maxl = len(meanl)\n\n    #plt.plot(mavg(meanl, 100), label=g) #, color='blue')\n    #plt.fill_between(xx, lowl, highl,  alpha=.1)\n    #plt.fill_between(xx, q1l, q3l,  alpha=.3)\n    #plt.plot(meanl) #, color='blue')\n    plt.plot(mavg(medianl, 10), label=g, ls=myls) #, color='blue')  # mavg changes the number of points !\n    #plt.plot(mavg(q1l, 100), label=g, alpha=.3) #, color='blue')\n    #plt.plot(mavg(q3l, 100), label=g, alpha=.3) #, color='blue')\n    #plt.fill_between(xx, q1l, q3l,  
alpha=.2)\n    #plt.plot(medianl, label=g) #, color='blue')\n\nplt.legend()\n#plt.xlabel('Loss (sum square diff. b/w final output and target)')\nplt.xlabel('Number of Episodes')\nplt.ylabel('Loss')\nxt = range(0, maxl, 100)\nxtl = [str(5000*i) for i in xt]  #5000 = 500 episode per loss saving, plus the decimation above\nplt.xticks(xt, xtl)\nplt.tight_layout()\n\n\n"
  },
  {
    "path": "omniglot/request.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_omniglot\", \n    \"tag\":\"master-test-2018_6_22_10_40_5\", \n    \"name\":\"Exp7_OmniglotNoSepPlast_alpha_free_tanh_oja_lr3e-5_gamma0.666_NFS\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 python3 omniglot.py --nbclasses 5  --nbiter 5000000 --rule oja --activ tanh --steplr 1000000 --prestime 1 --prestimetest 1 --gamma .666 --alpha free --lr 3e-5 --rngseed {{mesos.instance}}\",\n    \"ramMB\":6000,\n    \"gpus\":1,\n    \"diskMB\":6000,\n    \"resourcePool\": \"/ailabs/p2/tmiconi\",\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"instances\":10,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"1080ti\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "omniglot/test_omniglot_allseeds.py",
    "content": "# Differentiable plasticity: Omniglot task.\n\n# Copyright (c) 2018 Uber Technologies, Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#    http://www.apache.org/licenses/LICENSE-2.0\n#\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n\n\n# Using the output files produced by multiple runs of omniglot.py, test the\n# trained networks and report their performance (using withheld test classes).\n\n# NOTE: you need to specify the suffix of the files you want to test (see\n# definition of suffix below). Also be sure to use the proper directory.\n\nimport pdb \nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport click\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nfrom torch import optim\nimport random\nimport sys\nimport pickle\nimport pdb\nimport time\nimport skimage\nfrom skimage import transform\nimport os\nimport platform\n\n\nimport matplotlib.pyplot as plt\nimport glob\n\nimport omniglot\n\nfrom omniglot import Network\n\n\n\nnp.set_printoptions(precision=4)\n\n\ndefaultParams = {\n# Not really used as the parameters will be read from the saved files\n    'nbclasses': 5,\n    'nbshots': 1,  # Number of 'shots' in the few-shots learning\n    'prestime': 1,\n    'nbf' : 64    ,\n    'prestimetest': 1,\n    'interpresdelay': 0,\n    'imagesize': 31,    # 28*28\n    'nbiter': 10000000,\n    'learningrate': 1e-5,\n    'print_every': 10,\n    'rngseed':0\n}\nNBTESTCLASSES = 100\n\n\n\n\n#ttype = torch.FloatTensor;\nttype = torch.cuda.FloatTensor;\n\n\n# Generate the 
full list of inputs, labels, and the target label for an episode\ndef generateInputsLabelsAndTarget(params, imagedata, test=False):\n    #print((\"Input Boost:\", params['inputboost']))\n    #params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['interpresdelay']) * params['nbclasses']) + params['prestimetest'] \n    inputT = np.zeros((params['nbsteps'], 1, 1, params['imagesize'], params['imagesize']))    #inputTensor, initially in numpy format... Note dimensions: number of steps x batchsize (always 1) x NbChannels (also 1) x h x w \n    labelT = np.zeros((params['nbsteps'], 1, params['nbclasses']))      #labelTensor, initially in numpy format...\n\n    patterns=[]\n    if test:\n        cats = np.random.permutation(np.arange(len(imagedata) - NBTESTCLASSES, len(imagedata)))[:params['nbclasses']]  # Which categories to use for this *testing* episode?\n    else:\n        cats = np.random.permutation(np.arange(len(imagedata) - NBTESTCLASSES))[:params['nbclasses']]  # Which categories to use for this *training* episode?\n    #print(\"Test is\", test, \", cats are\", cats)\n    #cats = np.array(range(params['nbclasses'])) + 10\n\n    cats = np.random.permutation(cats)\n    #print(cats)\n\n    # We show one picture of each category, with labels, then one picture of one of these categories as a test, without label\n    # But each of the categories may undergo rotation by 0, 90, 180 or 270deg\n    # NOTE: We randomly assign one rotation to all the possible categories, not just the ones selected for the episode - it makes the coding simpler\n    rots = np.random.randint(4, size=len(imagedata))\n\n    #rots.fill(0)\n\n    testcat = random.choice(cats) # select the class on which we'll test in this episode\n    unpermcats = cats.copy()\n\n    # Inserting the character images and labels in the input tensor at the proper places\n    location = 0\n    for nc in range(params['nbshots']):\n        np.random.shuffle(cats)   # Presentations occur in random order\n    
    for ii, catnum in enumerate(cats):\n            #print(catnum)\n            p = random.choice(imagedata[catnum])\n            for nr in range(rots[catnum]):\n                p = np.rot90(p)\n            p = skimage.transform.resize(p, (31, 31))\n            for nn in range(params['prestime']):\n                #numi =nc * (params['nbclasses'] * (params['prestime']+params['interpresdelay'])) + ii * (params['prestime']+params['interpresdelay']) + nn\n\n                inputT[location][0][0][:][:] = p[:][:]\n                labelT[location][0][np.where(unpermcats == catnum)] = 1\n                #if nn == 0:\n                #    print(labelT[location][0])\n                location += 1\n            location += params['interpresdelay']\n\n    # Inserting the test character\n    p = random.choice(imagedata[testcat])\n    for nr in range(rots[testcat]):\n        p = np.rot90(p)\n    p = skimage.transform.resize(p, (31, 31))\n    for nn in range(params['prestimetest']):\n        inputT[location][0][0][:][:] = p[:][:]\n        location += 1\n        \n    # Generating the test label\n    testlabel = np.zeros(params['nbclasses'])\n    testlabel[np.where(unpermcats == testcat)] = 1\n    #print(testcat, testlabel)\n\n    #pdb.set_trace()\n        \n    \n    assert(location == params['nbsteps'])\n\n    inputT = torch.from_numpy(inputT).type(ttype)  # Convert from numpy to Tensor\n    labelT = torch.from_numpy(labelT).type(ttype)\n    targetL = torch.from_numpy(testlabel).type(ttype)\n\n    return inputT, labelT, targetL\n\n\n\n\n\ndef train(paramdict=None):\n    print(\"Initializing random seeds\")\n    np.random.seed(0); random.seed(0); torch.manual_seed(0)\n    print(\"Starting testing...\")\n    params = {}\n\n    params.update(defaultParams)\n    if paramdict:\n        params.update(paramdict)\n    \n    #pdb.set_trace()\n\n    print(\"Loading Omniglot data...\")\n    imagedata = []\n    imagefilenames=[]\n    for basedir in 
('./omniglot-master/python/images_background/', \n                    './omniglot-master/python/images_evaluation/'):\n        alphabetdirs = glob.glob(basedir+'*')\n        print(alphabetdirs[:4])\n        for alphabetdir in alphabetdirs:\n            chardirs = glob.glob(alphabetdir+\"/*\")\n            for chardir in chardirs:\n                chardata = []\n                charfiles = glob.glob(chardir+'/*')\n                for fn in charfiles:\n                    filedata = plt.imread(fn)\n                    chardata.append(filedata)\n                imagedata.append(chardata)\n                imagefilenames.append(fn)\n    # imagedata is now a list of lists of numpy arrays \n    # imagedata[CharactertNumber][FileNumber] -> numpy(105,105)\n    np.random.shuffle(imagedata)  # Randomize order of characters \n    print(len(imagedata))\n    print(imagedata[1][2].shape)\n    print(\"Data loaded!\")\n\n\n    successrates = []\n    totaliter = 0 \n    totalmistakes = 0\n\n    for myseed in range(10):\n\n\n        #suffix=\"_Wactiv_tanh_alpha_free_flare_0_gamma_0.75_imgsize_31_ipd_0_lr_3e-05_nbclasses_5_nbf_64_nbiter_5000000_nbshots_1_prestime_1_prestimetest_1_rule_oja_steplr_1000000.0_rngseed_\"+str(myseed)\n        suffix=\"_Wactiv_tanh_alpha_free_flare_0_gamma_0.666_imgsize_31_ipd_0_lr_3e-05_nbclasses_5_nbf_64_nbiter_5000000_nbshots_1_prestime_1_prestimetest_1_rule_oja_steplr_1000000.0_rngseed_\"+str(myseed)+\"_5000000\"\n        with open('./tmp/results'+suffix+'.dat', 'rb') as fo:\n            tmpw = torch.nn.Parameter(torch.from_numpy(pickle.load(fo)).type(ttype))\n            tmpalpha = torch.nn.Parameter(torch.from_numpy(pickle.load(fo)).type(ttype))\n            tmpeta = torch.nn.Parameter(torch.from_numpy(pickle.load(fo)).type(ttype))\n            tmplss = pickle.load(fo)\n            paramdictLoadedFromFile = pickle.load(fo)\n        \n        params.update(paramdictLoadedFromFile)\n\n        \n        print(\"Initializing network\")\n        net = 
Network(params)\n        #net.cuda()\n        \n        print (\"Size of all optimized parameters:\", [x.size() for x in net.parameters()])\n        allsizes = [torch.numel(x.data.cpu()) for x in net.parameters()]\n        print (\"Size (numel) of all optimized elements:\", allsizes)\n        print (\"Total size (numel) of all optimized elements:\", sum(allsizes))\n\n        \n        print(\"Passed params: \", params)\n        print(platform.uname())\n        sys.stdout.flush()\n        params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['interpresdelay']) * params['nbclasses']) + params['prestimetest']  # Total number of steps per episode\n        \n        net.load_state_dict(torch.load('./tmp/torchmodel'+suffix + '.txt'))\n        \n        \n        \n        params['nbiter'] = 100\n\n\n\n        # Initialize random seeds ; not sure if really useful here...\n        #print(\"Setting random seed to\", params['rngseed'])\n        #np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n        #print(click.get_current_context().params)\n\n        total_loss = 0.0\n        #print(\"Initializing optimizer\")\n        ##optimizer = torch.optim.Adam([net.w, net.alpha, net.eta], lr=params['learningrate'])\n        #optimizer = torch.optim.Adam(net.parameters(), lr=params['learningrate'])\n        all_losses = []\n        #print_every = 20\n        nowtime = time.time()\n        print(\"Starting episodes...\")\n        sys.stdout.flush()\n\n        nbmistakes = 0\n\n        for numiter in range(params['nbiter']):\n            \n            hebb = net.initialZeroHebb()\n            #optimizer.zero_grad()\n\n\n            is_test_step = 1\n\n            inputs, labels, target = generateInputsLabelsAndTarget(params, imagedata, test=is_test_step)\n\n\n            for numstep in range(params['nbsteps']):\n                y, hebb = net(Variable(inputs[numstep], requires_grad=False), Variable(labels[numstep], 
requires_grad=False), hebb)\n\n            #loss = (y[0] - Variable(target, requires_grad=False)).pow(2).sum()\n            criterion = torch.nn.BCELoss()\n            loss = criterion(y[0], Variable(target, requires_grad=False))\n\n            #if is_test_step == False:\n            #    loss.backward()\n            #    optimizer.step()\n\n            lossnum = loss.data[0]\n            #total_loss  += lossnum\n            if is_test_step:\n                total_loss = lossnum\n\n            if is_test_step: # (numiter+1) % params['print_every'] == 0:\n\n                print(numiter, \"====\")\n                td = target.cpu().numpy()\n                yd = y.data.cpu().numpy()[0]\n                #print(\"y: \", yd[:10])\n                #print(\"target: \", td[:10])\n                if np.argmax(td) != np.argmax(yd):\n                    print(\"Mistake!\")\n                    nbmistakes += 1\n                #print(\"target: \", target.unsqueeze(0)[0][:10])\n                absdiff = np.abs(td-yd)\n                #print(\"Mean / median / max abs diff:\", np.mean(absdiff), np.median(absdiff), np.max(absdiff))\n                #print(\"Correlation (full / sign): \", np.corrcoef(td, yd)[0][1], np.corrcoef(np.sign(td), np.sign(yd))[0][1])\n                #print inputs[numstep]\n                previoustime = nowtime\n                nowtime = time.time()\n                #print(\"Time spent on last\", params['print_every'], \"iters: \", nowtime - previoustime)\n                #total_loss /= params['print_every']\n                #print(\"Mean loss over last\", params['print_every'], \"iters:\", total_loss)\n                #print(\"Loss on single withheld-data episode:\", lossnum)\n                all_losses.append(total_loss)\n                #print (\"Eta: \", net.eta.data.cpu().numpy())\n                sys.stdout.flush()\n                sys.stderr.flush()\n\n                total_loss = 0\n\n        all_losses = np.array(all_losses)\n        print(\"Mean 
/ std all losses :\", np.mean(all_losses), np.std(all_losses))\n        print(\"1st Quartile / median / 3rd Quartile all losses :\", np.percentile(all_losses, 25), np.percentile(all_losses, 50), np.percentile(all_losses, 75))\n        print(\"Max of all losses :\", np.max(all_losses))\n        print(\"Nb of mistakes :\", nbmistakes, \"over\", numiter+1, \"trials - (\", 100.0 - 100.0 * nbmistakes / (numiter+1), \" % correct )\")\n        successrates.append(100.0 - 100.0 * nbmistakes / (numiter+1))\n        totalmistakes += nbmistakes\n        totaliter += params['nbiter']\n\n    print (\"Mean / stdev success rate across runs: \", np.mean(successrates), np.std(successrates))\n    totalsuccessrate = 1.0 - totalmistakes / totaliter\n    pointestCI = 1.96 * np.sqrt(totalsuccessrate * (1.0 - totalsuccessrate) / totaliter)\n    print (\"Success % across all trials (95% CI point estimate):\", 100.0 * totalsuccessrate, \"+/-\", 100.0 * pointestCI)\n    print (totalmistakes, \"mistakes out of \", totaliter, \"trials\")\n\n\n    print (\"Median success rate across runs: \", np.median(successrates))\n\n\n@click.command()\n@click.option('--nbclasses', default=defaultParams['nbclasses'])\n@click.option('--nbshots', default=defaultParams['nbshots'])\n@click.option('--prestime', default=defaultParams['prestime'])\n@click.option('--prestimetest', default=defaultParams['prestimetest'])\n@click.option('--interpresdelay', default=defaultParams['interpresdelay'])\n@click.option('--nbiter', default=defaultParams['nbiter'])\n@click.option('--learningrate', default=defaultParams['learningrate'])\n@click.option('--print_every', default=defaultParams['print_every'])\n@click.option('--rngseed', default=defaultParams['rngseed'])\ndef main(nbclasses, nbshots, prestime, prestimetest, interpresdelay, nbiter, learningrate, print_every, rngseed):\n    train(paramdict=dict(click.get_current_context().params))\n    #print(dict(click.get_current_context().params))\n\nif __name__ == \"__main__\":\n   
 #train()\n    main()\n\n"
  },
  {
    "path": "opus.docker",
    "content": "#tmiconi_rl\n#latest\n#.\n\n\n#FROM localhost:5000/opus-deep-learning:master-test-2017_9_7_20_56_10\nFROM opus-deep-learning-py3:master-prod-2019_2_5_4_54_39\n#FROM opus-deep-learning:master--2018_9_20_18_2_31\n\n\n\n\nRUN mkdir /home/work\n\nCOPY ./sr/*.py /home/work/sr/\nCOPY ./sr/*.md /home/work/sr/\n\nCOPY ./maze/*.py /home/work/maze/\nCOPY ./maze/*.md /home/work/maze/\n\nCOPY ./simplemaze/*.py /home/work/simplemaze/\nCOPY ./simplemaze/*.md /home/work/simplemaze/\n\nCOPY ./awd-lstm-lm/*.py /home/work/awd-lstm-lm/\nCOPY ./awd-lstm-lm/*.sh /home/work/awd-lstm-lm/\nCOPY ./awd-lstm-lm/*.md /home/work/awd-lstm-lm/\n\n#COPY ./*.py /home/work/\n#COPY ./*.sh /home/work/\n#COPY ./*.md /home/work/\n\nENV LC_ALL C.UTF-8\nENV LANG C.UTF-8\n\n"
  },
  {
    "path": "request_devbox.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2019_5_21_10_41_12\",\n    \"cpus\":2.0,\n    \"ramMB\":26000,\n    \"gpus\":1,\n    \"diskMB\":8000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":1,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"p6000\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "request_lstm.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2019_5_21_10_41_12\",\n    \"name\":\"newcode_PLASTICLSTM_agdiv1149_opus_alphatype_perneuron_modultype_modplasth2mod_modulout_fanout_asgdtime_125_1149n_5run\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/awd-lstm-lm \\u0026\\u0026 apt-get install unzip \\u0026\\u0026 sh ./getdata.sh \\u0026\\u0026 python3 main.py --batch_size 6 --data data/penn --dropouti 0.4 --dropouth 0.25  --epoch 500 --save PTB.pt --wdrop 0 --model PLASTICLSTM --modultype modplasth2mod --modulout fanout --nhid 1149  --alphatype perneuron --asgdtime 125 --agdiv 1149 --seed {{mesos.instance}} \",\n    \"ramMB\":25000,\n    \"gpus\":1,\n    \"diskMB\":6000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":5,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"p6000\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "request_lstm_simple.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2019_5_21_10_41_12\",\n    \"name\":\"newcode_SIMPLEPLASTICLSTM_agdiv1149_opus_alphatype_perneuron_modultype_modplasth2mod_modulout_fanout_asgdtime_125_1149n_5run\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/awd-lstm-lm \\u0026\\u0026 apt-get install unzip \\u0026\\u0026 sh ./getdata.sh \\u0026\\u0026 python3 main.py --batch_size 6 --data data/penn --dropouti 0.4 --dropouth 0.25  --epoch 500 --save PTB.pt --wdrop 0 --model SIMPLEPLASTICLSTM --modultype modplasth2mod --modulout fanout --nhid 1149  --alphatype perneuron --asgdtime 125 --agdiv 1149 --seed {{mesos.instance}} \",\n    \"ramMB\":25000,\n    \"gpus\":1,\n    \"diskMB\":6000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":5,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"p6000\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "simple/.gitignore",
    "content": "*.txt\n*.dat\n"
  },
  {
    "path": "simple/OpusHdfsCopy.py",
    "content": "# Uber-only code for interacting with hdfs\n#\n# Copyright (c) 2018 Uber Technologies, Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#    http://www.apache.org/licenses/LICENSE-2.0\n#\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n\nimport os\nimport os.path\n\ndef checkHdfs():\n    return os.path.isfile('/opt/hadoop/latest/bin/hdfs')\n\ndef transferFileToHdfsPath(sourcepath, targetpath):\n    hdfspath = targetpath\n    targetdir = os.path.dirname(targetpath)\n    os.system('/opt/hadoop/latest/bin/hdfs dfs -mkdir -p {}'.format(targetdir))\n    result = os.system(\n        '/opt/hadoop/latest/bin/hdfs dfs -copyFromLocal -f {} {}'.format(sourcepath, hdfspath)\n    )\n    if result != 0:\n        raise OSError('Cannot copyFromLocal {} {} returned {}'.format(sourcepath, hdfspath, result))\n\ndef transferFileToHdfsDir(sourcepath, targetdir):\n    hdfspath = os.path.join(targetdir, os.path.basename(sourcepath))\n    os.system('/opt/hadoop/latest/bin/hdfs dfs -mkdir -p {}'.format(targetdir))\n    result = os.system(\n        '/opt/hadoop/latest/bin/hdfs dfs -copyFromLocal -f {} {}'.format(sourcepath, hdfspath)\n    )\n    if result != 0:\n        raise OSError('Cannot copyFromLocal {} {} returned {}'.format(sourcepath, hdfspath, result))\n\n"
  },
  {
    "path": "simple/README.md",
    "content": "# Pattern memorization and completion\n\nThis code implements the pattern completion task. Five binary patterns of 1000 elements are shown once each, and then a degraded copy of one of these patterns (with half the elements zeroed out) is presented and must be completed.\n\nThe `simplest.py` program is the simplest, fully functional implementation of this task with a recurrent plastic network. This program is designed to provide an easily understood example of differentiable plasticity. It requires PyTorch, but does not use a GPU.\n\n`simple.py` is a slightly more elaborate version that can make use of a GPU.\n\nThe `full.py` and `lstm.py` programs have more options and can be used to compare different architectures.\n\nTo produce the results shown in the paper:\n\n```\npython3 full.py --patternsize 50 --nbaddneurons 2000 --nbprescycles 1 --nbpatterns 2 --prestime 3 --interpresdelay 1 --nbiter 1000000 --lr 3e-5 --type nonplastic \npython3 full.py --patternsize 50 --nbaddneurons 0 --nbprescycles 1 --nbpatterns 2 --prestime 3 --interpresdelay 1 --nbiter 1000000 --lr 3e-4 --type plastic \npython3 lstm.py --patternsize 50 --nbaddneurons 1949 --nbprescycles 1 --nbpatterns 2 --prestime 3 --interpresdelay 1 --nbiter 1000000 --clamp 1 --lr 3e-5 \n```\n\n"
  },
  {
    "path": "simple/full.py",
    "content": "# Differentiable plasticity: binary pattern memorization and reconstruction\n#\n# Copyright (c) 2018 Uber Technologies, Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#    http://www.apache.org/licenses/LICENSE-2.0\n#\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n#\n# This more flexible implementation includes both plastic and non-plastic RNNs. LSTM code is sufficiently different that it makes more sense to put it in a different file.\n# Also includes some Uber-specific stuff for file transfer. Commented out by default.\n\n\n# Parameters optimized for non-plastic architectures (esp. 
LSTM): \n# --patternsize 50 --nbaddneurons 2000 --nbprescycles 1 --nbpatterns 2 --prestime 3 --interpresdelay 1 --nbiter 1000000 --lr 3e-5\n# For comparing plastic and non-plastic, we use these for both (though the plastic architecture strongly prefers the default ones)\n# Plastic networks can learn with lr=3e-4.\n# The default parameters are those for the plastic RNN on the 1000-bit task (same as simple.py) \n\nimport argparse\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nfrom torch import optim\nimport random\nimport sys\nimport pickle\nimport pdb\nimport time\n\n# Uber-only (comment out if not at Uber):\nimport OpusHdfsCopy\nfrom OpusHdfsCopy import transferFileToHdfsDir, checkHdfs\n\n\n# Parsing command-line arguments\nparams = {}; params['rngseed'] = 0\nparser = argparse.ArgumentParser()\nparser.add_argument(\"--rngseed\", type=int, help=\"random seed\", default=0)\nparser.add_argument(\"--nbiter\", type=int, help=\"number of episodes\", default=2000)\nparser.add_argument(\"--nbaddneurons\", type=int, help=\"number of additional neurons\", default=0)\nparser.add_argument(\"--lr\", type=float, help=\"learning rate of Adam optimizer\", default=3e-4)\nparser.add_argument(\"--patternsize\", type=int, help=\"size of the binary patterns\", default=1000)\nparser.add_argument(\"--nbpatterns\", type=int, help=\"number of patterns to memorize\", default=5)\nparser.add_argument(\"--nbprescycles\", type=int, help=\"number of presentation cycles\", default=2)\nparser.add_argument(\"--prestime\", type=int, help=\"number of time steps for each pattern presentation\", default=6)\nparser.add_argument(\"--interpresdelay\", type=int, help=\"number of time steps between each pattern presentation (with zero input)\", default=4)\nparser.add_argument(\"--type\", help=\"network type ('plastic' or 'nonplastic')\", default='plastic')\nargs = parser.parse_args(); argvars = 
 vars(args); argdict = { k : argvars[k] for k in argvars if argvars[k] is not None }\nparams.update(argdict)\n\nPATTERNSIZE = params['patternsize']\nNBNEUR = PATTERNSIZE + params['nbaddneurons'] + 1  # NbNeur = Pattern Size + additional neurons + 1 \"bias\", fixed-output neuron (bias neuron not needed for this task, but included for completeness)\nETA = .01               # The \"learning rate\" of plastic connections\nADAMLEARNINGRATE = params['lr']\n\nPROBADEGRADE = .5       # Proportion of bits to zero out in the target pattern at test time\nNBPATTERNS = params['nbpatterns'] # The number of patterns to learn in each episode\nNBPRESCYCLES = params['nbprescycles']        # Number of times each pattern is to be presented\nPRESTIME = params['prestime'] # Number of time steps for each presentation\nPRESTIMETEST = PRESTIME        # Same thing but for the final test pattern\nINTERPRESDELAY = params['interpresdelay']      # Duration of zero-input interval between presentations\nNBSTEPS = NBPRESCYCLES * ((PRESTIME + INTERPRESDELAY) * NBPATTERNS) + PRESTIMETEST  # Total number of steps per episode\n\n\n\n#ttype = torch.FloatTensor;         # For CPU\nttype = torch.cuda.FloatTensor;     # For GPU\n\n\n# Generate the full list of inputs for an episode. 
The inputs are returned as a PyTorch tensor of shape NbSteps x 1 x NbNeur\ndef generateInputsAndTarget():\n    inputT = np.zeros((NBSTEPS, 1, NBNEUR)) #inputTensor, initially in numpy format...\n\n    # Create the random patterns to be memorized in an episode\n    seedp = np.ones(PATTERNSIZE); seedp[:PATTERNSIZE//2] = -1\n    patterns=[]\n    for nump in range(NBPATTERNS):\n        p = np.random.permutation(seedp)\n        patterns.append(p)\n\n    # Now 'patterns' contains the NBPATTERNS patterns to be memorized in this episode - in numpy format\n    # Choosing the test pattern, partially zero'ed out, that the network will have to complete\n    testpattern = random.choice(patterns).copy()\n    preservedbits = np.ones(PATTERNSIZE); preservedbits[:int(PROBADEGRADE * PATTERNSIZE)] = 0; np.random.shuffle(preservedbits)\n    degradedtestpattern = testpattern * preservedbits\n\n    # Inserting the inputs in the input tensor at the proper places\n    for nc in range(NBPRESCYCLES):\n        np.random.shuffle(patterns)\n        for ii in range(NBPATTERNS):\n            for nn in range(PRESTIME):\n                numi =nc * (NBPATTERNS * (PRESTIME+INTERPRESDELAY)) + ii * (PRESTIME+INTERPRESDELAY) + nn\n                inputT[numi][0][:PATTERNSIZE] = patterns[ii][:]\n\n    # Inserting the degraded pattern\n    for nn in range(PRESTIMETEST):\n        inputT[-PRESTIMETEST + nn][0][:PATTERNSIZE] = degradedtestpattern[:]\n\n    for nn in range(NBSTEPS):\n        inputT[nn][0][-1] = 1.0  # Bias neuron.\n        inputT[nn] *= 20.0       # Strengthen inputs\n    inputT = torch.from_numpy(inputT).type(ttype)  # Convert from numpy to Tensor\n    target = torch.from_numpy(testpattern).type(ttype)\n\n    return inputT, target\n\n\n\n\nclass NETWORK(nn.Module):\n    def __init__(self):\n        super(NETWORK, self).__init__()\n        # Notice that the vectors are row vectors, and the matrices are transposed wrt the usual order, following apparent pytorch conventions\n        # Each 
*column* of w targets a single output neuron\n        self.w = Variable(.01 * torch.randn(NBNEUR, NBNEUR).type(ttype), requires_grad=True)   # The matrix of fixed (baseline) weights\n        self.alpha = Variable(.01 * torch.randn(NBNEUR, NBNEUR).type(ttype), requires_grad=True)  # The matrix of plasticity coefficients\n        self.eta = Variable(.01 * torch.ones(1).type(ttype), requires_grad=True)  # The eta coefficient is learned\n        self.zeroDiagAlpha()  # No plastic autapses\n\n    def forward(self, input, yin, hebb):\n        # Run the network for one timestep\n        if params['type'] == 'plastic':\n            yout = F.tanh( yin.mm(self.w + torch.mul(self.alpha, hebb)) + input )\n            hebb = (1 - self.eta) * hebb + self.eta * torch.bmm(yin.unsqueeze(2), yout.unsqueeze(1))[0] # bmm used to implement outer product with the help of unsqueeze (i.e. added empty dimensions)\n        elif params['type'] == 'nonplastic':\n            yout = F.tanh( yin.mm(self.w) + input )\n        else:\n            raise ValueError(\"Wrong network type!\")\n        return yout, hebb\n\n    def initialZeroState(self):\n        return Variable(torch.zeros(1, NBNEUR).type(ttype))\n\n    def initialZeroHebb(self):\n        return Variable(torch.zeros(NBNEUR, NBNEUR).type(ttype))\n\n    def zeroDiagAlpha(self):\n        # Zero out the diagonal of the matrix of alpha coefficients: no plastic autapses\n        self.alpha.data -= torch.diag(torch.diag(self.alpha.data))\n\n\nnp.set_printoptions(precision=3)\nnp.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n\n\n\n\nnet = NETWORK()\noptimizer = torch.optim.Adam([net.w, net.alpha, net.eta], lr=ADAMLEARNINGRATE)\ntotal_loss = 0.0; all_losses = []\nprint_every = 100\nsave_every = 1000\nnowtime = time.time()\nsuffix = \"binary_\"+\"\".join([str(x)+\"_\" if pair[0] != 'nbsteps' and pair[0] != 'rngseed' and pair[0] != 'save_every' and pair[0] != 'test_every' else '' for pair in sorted(zip(params.keys(), params.values()), key=lambda x:x[0] ) for x in pair])[:-1] + \"_rngseed_\" + str(params['rngseed'])   # Turning the parameters into a nice suffix for filenames (note: string comparison must use '!=', not 'is not')\n\nfor numiter in range(params['nbiter']):\n    # Initialize network for each episode\n    y = net.initialZeroState()\n    hebb = net.initialZeroHebb()\n    optimizer.zero_grad()\n\n    # Generate the inputs and target pattern for this episode\n    inputs, target = generateInputsAndTarget()\n\n    # Run the episode!\n    for numstep in range(NBSTEPS):\n        y, hebb = net(Variable(inputs[numstep], requires_grad=False), y, hebb)\n\n    # Compute loss for this episode (last step only)\n    loss = (y[0][:PATTERNSIZE] - Variable(target, requires_grad=False)).pow(2).sum()\n\n    # Apply backpropagation to adapt basic weights and plasticity coefficients\n    loss.backward()\n    optimizer.step()\n    #net.zeroDiagAlpha()  # Removes plastic autapses - turned out to be unneeded\n\n    # That's it for the actual algorithm.\n    # Print statistics, save files\n    #lossnum = loss.data[0]     # Saved loss is the actual training loss (MSE)\n    to = target.cpu().numpy(); yo = y.data.cpu().numpy()[0][:PATTERNSIZE]; z = (np.sign(yo) != np.sign(to)); lossnum = np.mean(z)  # Saved loss is the error rate\n    total_loss += lossnum\n    if (numiter+1) % print_every == 0:\n        print((numiter, \"====\"))\n        print(target.cpu().numpy()[-10:])   # Target pattern to be reconstructed\n        print(inputs.cpu().numpy()[numstep][0][-10:])  # Last input contains the degraded pattern fed to the network at test time\n        print(y.data.cpu().numpy()[0][-10:])   # Final output of the network\n        previoustime = nowtime\n        nowtime = time.time()\n        print(\"Time spent on last\", print_every, \"iters: \", nowtime - previoustime)\n        total_loss /= print_every\n        all_losses.append(total_loss)\n        print(\"Mean loss over last\", print_every, \"iters:\", total_loss)\n        print(\"\")\n    if (numiter+1) % save_every == 0:\n        with open('outputs_'+suffix+'.dat', 'wb') as fo:\n            pickle.dump(net.w.data.cpu().numpy(), fo)\n            pickle.dump(net.alpha.data.cpu().numpy(), fo)\n            pickle.dump(y.data.cpu().numpy(), fo)  # The final y for this episode\n            pickle.dump(all_losses, fo)\n        with open('loss_'+suffix+'.txt', 'w') as fo:\n            for item in all_losses:\n                fo.write(\"%s\\n\" % item)\n        # Uber-only\n        if checkHdfs():\n            print(\"Transferring to HDFS...\")\n            transferFileToHdfsDir('loss_'+suffix+'.txt', '/ailabs/tmiconi/simple/')\n            #transferFileToHdfsDir('results_simple_'+str(params['rngseed'])+'.dat', '/ailabs/tmiconi/exp/')\n\n        total_loss = 0\n\n\n\n"
  },
  {
    "path": "simple/lstm.py",
    "content": "# Memorization of two 50-bit binary patterns per episode, with LSTMs. Takes a very long time to learn the task, and even then imperfectly. 2050 neurons (fewer neurons = worse performance).\n#\n#\n# Copyright (c) 2018 Uber Technologies, Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#    http://www.apache.org/licenses/LICENSE-2.0\n#\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n\nimport argparse\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nfrom torch import optim\nimport random\nimport sys\nimport pickle as pickle\nimport pdb\nimport time\n\n# Uber-only (comment out if not at Uber)\nimport OpusHdfsCopy\nfrom OpusHdfsCopy import transferFileToHdfsDir, checkHdfs\n\n# Parsing command-line arguments\nparams = {}; params['rngseed'] = 0\nparser = argparse.ArgumentParser()\nparser.add_argument(\"--rngseed\", type=int, help=\"random seed\", default=0)\nparser.add_argument(\"--nbiter\", type=int, help=\"number of episodes\", default=2000)\nparser.add_argument(\"--clamp\", type=int, help=\"whether inputs are clamping (1) or not (0)\", default=1)\nparser.add_argument(\"--nbaddneurons\", type=int, help=\"number of additional neurons\", default=0)\nparser.add_argument(\"--lr\", type=float, help=\"learning rate of Adam optimizer\", default=3e-4)\nparser.add_argument(\"--patternsize\", type=int, help=\"size of the binary patterns\", default=1000)\nparser.add_argument(\"--nbpatterns\", type=int, help=\"number of patterns to 
memorize\", default=5)\nparser.add_argument(\"--nbprescycles\", type=int, help=\"number of presentation cycles\", default=2)\nparser.add_argument(\"--prestime\", type=int, help=\"number of time steps for each pattern presentation\", default=6)\nparser.add_argument(\"--interpresdelay\", type=int, help=\"number of time steps between each pattern presentation (with zero input)\", default=4)\nparser.add_argument(\"--type\", help=\"network type ('plastic' or 'nonplastic')\", default='plastic')\nargs = parser.parse_args(); argvars = vars(args); argdict =  { k : argvars[k] for k in argvars if argvars[k] != None }\nparams.update(argdict)\n\nPATTERNSIZE = params['patternsize']\nNBHIDDENNEUR = PATTERNSIZE + params['nbaddneurons'] + 1  # NbNeur = Pattern Size + additional neurons + 1 \"bias\", fixed-output neuron (bias neuron not needed for this task, but included for completeness)\nETA = .01               # The \"learning rate\" of plastic connections; not used for LSTMs\nADAMLEARNINGRATE = params['lr']\n\nPROBADEGRADE = .5       # Proportion of bits to zero out in the target pattern at test time\nCLAMPING = params['clamp']\nNBPATTERNS = params['nbpatterns'] # The number of patterns to learn in each episode\nNBPRESCYCLES = params['nbprescycles']        # Number of times each pattern is to be presented\nPRESTIME = params['prestime'] # Number of time steps for each presentation\nPRESTIMETEST = PRESTIME        # Same thing but for the final test pattern\nINTERPRESDELAY = params['interpresdelay']      # Duration of zero-input interval between presentations\nNBSTEPS = NBPRESCYCLES * ((PRESTIME + INTERPRESDELAY) * NBPATTERNS) + PRESTIMETEST  # Total number of steps per episode\n\nRNGSEED = params['rngseed']\n\n#PATTERNSIZE = 50 \n#\n## Note: For LSTM, there are PATTERNSIZE input and output neurons, and NBHIDDENNEUR neurons in the hidden recurrent layer\n##NBNEUR = PATTERNSIZE  # NbNeur = Pattern Size + 1 \"bias\", fixed-output neuron (bias neuron not needed for this task, but 
included for completeness)\n#NBHIDDENNEUR = 2000  #  1000 takes longer \n#\n##ETA = .01               # The \"learning rate\" of plastic connections. Not used for LSTMs.\n#ADAMLEARNINGRATE = 3e-5 # 1e-4  # 3e-5 works better in the long run. 1e-4 OK. 3e-4 fails.\n#RNGSEED = 0\n#\n#PROBADEGRADE = .5       # Proportion of bits to zero out in the target pattern at test time\n#NBPATTERNS = 2          # The number of patterns to learn in each episode\n#NBPRESCYCLES = 1        # Number of times each pattern is to be presented\n#PRESTIME = 3            # Number of time steps for each presentation\n#PRESTIMETEST = 3        # Same thing but for the final test pattern\n#INTERPRESDELAY = 1      # Duration of zero-input interval between presentations\n#NBSTEPS = NBPRESCYCLES * ((PRESTIME + INTERPRESDELAY) * NBPATTERNS) + PRESTIMETEST  # Total number of steps per episode\n\n#ttype = torch.FloatTensor;\nttype = torch.cuda.FloatTensor;\n\n# Generate the full list of inputs for an episode. The inputs are returned as a PyTorch tensor of shape NbSteps x 1 x NbNeur\ndef generateInputsAndTarget():\n    #inputT = np.zeros((NBSTEPS, 1, NBNEUR)) #inputTensor, initially in numpy format...\n    inputT = np.zeros((NBSTEPS, 1, PATTERNSIZE)) #inputTensor, initially in numpy format...\n\n    # Create the random patterns to be memorized in an episode\n    seedp = np.ones(PATTERNSIZE); seedp[:PATTERNSIZE//2] = -1\n    patterns=[]\n    for nump in range(NBPATTERNS):\n        p = np.random.permutation(seedp)\n        patterns.append(p)\n\n    # Now 'patterns' contains the NBPATTERNS patterns to be memorized in this episode - in numpy format\n    # Choosing the test pattern, partially zero'ed out, that the network will have to complete\n    \n\n    testpattern = random.choice(patterns).copy()\n    #testpattern = patterns[1].copy()\n   \n    \n    preservedbits = np.ones(PATTERNSIZE); preservedbits[:int(PROBADEGRADE * PATTERNSIZE)] = 0; np.random.shuffle(preservedbits)\n    degradedtestpattern = 
testpattern * preservedbits\n\n    # Inserting the inputs in the input tensor at the proper places\n    for nc in range(NBPRESCYCLES):\n        np.random.shuffle(patterns)\n        for ii in range(NBPATTERNS):\n            for nn in range(PRESTIME):\n                numi =nc * (NBPATTERNS * (PRESTIME+INTERPRESDELAY)) + ii * (PRESTIME+INTERPRESDELAY) + nn\n                inputT[numi][0][:PATTERNSIZE] = patterns[ii][:]\n\n    # Inserting the degraded pattern\n    for nn in range(PRESTIMETEST):\n        inputT[-PRESTIMETEST + nn][0][:PATTERNSIZE] = degradedtestpattern[:]\n\n    for nn in range(NBSTEPS):\n        #inputT[nn][0][-1] = 1.0  # Bias neuron.\n        inputT[nn] *= 100.0       # Strengthen inputs\n    inputT = torch.from_numpy(inputT).type(ttype)  # Convert from numpy to Tensor\n    target = torch.from_numpy(testpattern).type(ttype)\n\n    return inputT, target\n\n\n\n\n\n\nclass NETWORK(nn.Module):\n    def __init__(self):\n        super(NETWORK, self).__init__()\n        self.lstm = torch.nn.LSTM(PATTERNSIZE, NBHIDDENNEUR).cuda() #input size, hidden size\n        self.hidden = self.initialZeroState() # Note that the \"hidden state\" is a tuple (hidden state, cells state)\n\n\n    def forward(self, inputs,):\n        # Run the network over entire sequence of inputs\n        self.hidden = self.initialZeroState()\n        if CLAMPING:\n            # This code allows us to make the inputs on the LSTM \"clamping\",\n            # i.e. neurons that receive an input have their output clamped at\n            # this value, to make it similar to the RNN architectures.\n            #\n            # Note that you get worse results if you don't use it ! 
(\"CLAMPING = 0\" above) (clamping automatically reduces chance error to ~.25, since all input bits are always correct)\n            #\n            #self.lstm.weight_hh_l0.data.fill_(0)\n            #self.lstm.weight_ih_l0.data.fill_(0)\n            self.lstm.bias_hh_l0.data.fill_(0)\n            #self.lstm.bias_ih_l0.data.fill_(0)\n            for ii in range(PATTERNSIZE):\n                self.lstm.weight_ih_l0.data[2*NBHIDDENNEUR + ii].fill_(0)\n                self.lstm.weight_ih_l0.data[2*NBHIDDENNEUR + ii][ii] = 10.0  # Trick to make inputs clamping on the cells, for fair comparison (need to also set input gates...)\n                self.lstm.bias_ih_l0.data[0*NBHIDDENNEUR+ ii]= 10.0 # bias to input gate\n                self.lstm.bias_ih_l0.data[1*NBHIDDENNEUR+ ii]= -1000.0 # bias to forget gate (actually a persistence gate? - sigmoid, so to set it to 0, put a massive negative bias)\n                self.lstm.bias_ih_l0.data[2*NBHIDDENNEUR+ ii]= 0 # bias to cell gate\n                self.lstm.bias_ih_l0.data[3*NBHIDDENNEUR+ ii]= 10.0 # bias to output gate; sigmoid\n        lstm_out, self.hidden = self.lstm(inputs, self.hidden)\n        #o = self.h2o(lstm_out) #.view(NBSTEPS, -1))\n        #outputz = F.tanh(o)\n        outputz = lstm_out\n        return outputz\n\n\n        #yout = F.tanh( yin.mm(self.w + torch.mul(self.alpha, hebb)) + input )\n        #hebb = (1 - ETA) * hebb + ETA * torch.bmm(yin.unsqueeze(2), yout.unsqueeze(1))[0] # bmm used to implement outer product with the help of unsqueeze (i.e. 
added empty dimensions)\n        #return yout, hebb\n\n    def initialZeroState(self):\n        return (Variable(torch.zeros(1, 1, NBHIDDENNEUR).type(ttype)),\n                                Variable(torch.zeros(1, 1, NBHIDDENNEUR).type(ttype)))\n\n\nif len(sys.argv) == 2:\n    RNGSEED = int(sys.argv[1])\n    print(\"Setting RNGSEED to \"+str(RNGSEED))\n\nnp.set_printoptions(precision=3)\nnp.random.seed(RNGSEED); random.seed(RNGSEED); torch.manual_seed(RNGSEED)\n\n\nnet = NETWORK()\noptimizer = torch.optim.Adam(net.parameters(), lr=ADAMLEARNINGRATE)\ntotal_loss = 0.0; all_losses = []\nprint_every = 100\nsave_every = 1000\nnowtime = time.time()\n\nfor numiter in range(params['nbiter']):\n    \n    optimizer.zero_grad()\n\n    net.hidden = net.initialZeroState()\n\n    # Generate the inputs and target pattern for this episode\n    inputs, target = generateInputsAndTarget()\n\n    # Run the episode!\n    y = net(Variable(inputs, requires_grad=False))[-1][0]\n\n    # Compute loss for this episode (last step only)\n    loss = (y[:PATTERNSIZE] - Variable(target, requires_grad=False)).pow(2).sum()\n    \n    \n    #pdb.set_trace()\n\n    # Apply backpropagation to adapt basic weights and plasticity coefficients\n    loss.backward()\n    optimizer.step()\n\n    # That's it for the actual algorithm.\n    # Print statistics, save files\n    #lossnum = loss.data[0]\n    yo = y.data.cpu().numpy()[:PATTERNSIZE]\n    to = target.cpu().numpy()\n    z = (np.sign(yo) != np.sign(to))\n    lossnum = np.mean(z)\n    total_loss  += lossnum\n    if (numiter+1) % print_every == 0:\n        print((numiter, \"====\"))\n        print(target.cpu().numpy()[:10])   # Target pattern to be reconstructed\n        print(inputs.cpu().numpy()[-1][0][:10])  # Last input contains the degraded pattern fed to the network at test time\n        print(y.data.cpu().numpy()[:10])   # Final output of the network\n        previoustime = nowtime\n        nowtime = time.time()\n        print(\"Time spent on 
last\", print_every, \"iters: \", nowtime - previoustime)\n        total_loss /= print_every\n        all_losses.append(total_loss)\n        print(\"Mean loss over last\", print_every, \"iters:\", total_loss)\n        print(\"\")\n    if (numiter+1) % save_every == 0:\n        fname = 'loss_binary_lstm_nbiter_'+str(params['nbiter'])+'_nbhneur_'+str(NBHIDDENNEUR)+'_clamp_'+str(CLAMPING)+'_lr_'+str(ADAMLEARNINGRATE)+'_prestime_'+str(PRESTIME)+'_ipd_'+str(INTERPRESDELAY)+'_rngseed_'+str(RNGSEED)+'.txt'\n        with open(fname, 'w') as fo:\n            for item in all_losses:\n                fo.write(\"%s\\n\" % item)\n\n        # Uber-only (comment out if not at Uber)\n        if checkHdfs():\n            print(\"Transfering to HDFS...\")\n            transferFileToHdfsDir(fname, '/ailabs/tmiconi/simple/')\n\n        total_loss = 0\n\n\n\n"
  },
  {
    "path": "simple/opus.docker",
    "content": "#tmiconi_rl\n#latest\n#.\n\n\n#FROM localhost:5000/opus-deep-learning:master-test-2017_9_7_20_56_10\nFROM localhost:5000/opus-deep-learning:master-test-2018_1_3_0_38_14\n\n\n\n\nRUN mkdir /home/work\n\nCOPY ./*.py /home/work/\n\nENV LC_ALL C.UTF-8\nENV  LANG C.UTF-8\n\n"
  },
  {
    "path": "simple/plotresults.py",
    "content": "# Code to plot learning curves\n#\n#\n# Copyright (c) 2018 Uber Technologies, Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#    http://www.apache.org/licenses/LICENSE-2.0\n#\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n\n \nimport numpy as np\nimport glob\nimport matplotlib.pyplot as plt\n\nfnames = glob.glob('./tmp/loss_simple_*.txt')\n#fnames = glob.glob('./tmp/loss_api_*.txt')\n#fnames = glob.glob('./tmp/loss_fixed_*.txt')\n\n\nplt.ion()\nplt.rc('font', size=12)\nplt.figure(figsize=(5,4))  # Smaller figure = relative larger fonts\n\n\nfulllosses=[]\nlosses=[]\nlgts=[]\nfor fn in fnames:\n    z = np.loadtxt(fn)\n    lgts.append(len(z))\n    fulllosses.append(z)\nminlen = min(lgts)\nfor z in fulllosses:\n    losses.append(z[:minlen])\n\nlosses = np.array(losses)\nmeanl = np.mean(losses, axis=0)\nstdl = np.std(losses, axis=0)\n\nhighl = np.max(losses, axis=0)\nlowl = np.min(losses, axis=0)\n#highl = meanl+stdl\n#lowl = meanl-stdl\n\nxx = range(len(meanl))\n\n# xticks and labels\nxt = range(0, len(meanl), 50)\nxtl = [str(10*i) for i in xt]\n\nplt.fill_between(xx, lowl, highl, color='blue', alpha=.5)\nplt.plot(meanl, color='blue')\n#plt.xlabel('Loss (sum square diff. b/w final output and target)')\nplt.xlabel('Number of Episodes')\nplt.ylabel('Loss')\nplt.xticks(xt, xtl)\nplt.tight_layout()\n\n\n"
  },
  {
    "path": "simple/request.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2018_6_5_9_32_56\",\n    \"name\":\"Exp_simple_1Miter_0addneur_plastic_lr3e-5\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 python3 full.py --patternsize 50 --nbaddneurons 0 --nbprescycles 1 --nbpatterns 2 --prestime 3 --interpresdelay 1 --nbiter 1000000 --lr 3e-5 --type plastic --rngseed {{mesos.instance}}\",\n    \"ramMB\":8000,\n    \"gpus\":1,\n    \"diskMB\":8000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p2/tmiconi\",\n    \"instances\":10,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"1080ti\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "simple/request_lstm.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2018_6_6_16_30_17\",\n    \"name\":\"ExpD_simple_lstm_1Miter_1949addneur_clamp1_lr3e-5\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 python3 lstm.py --patternsize 50 --nbaddneurons 1949 --nbprescycles 1 --nbpatterns 2 --prestime 3 --interpresdelay 1 --nbiter 1000000 --clamp 1 --lr 3e-5 --rngseed {{mesos.instance}}\",\n    \"ramMB\":8000,\n    \"gpus\":1,\n    \"diskMB\":8000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p2/tmiconi\",\n    \"instances\":10,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"1080ti\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "simple/simple.py",
    "content": "# Differentiable plasticity: simple binary pattern memorization and reconstruction.\n#\n# Copyright (c) 2018 Uber Technologies, Inc.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#    http://www.apache.org/licenses/LICENSE-2.0\n#\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n\n# This program is meant as a simple instructional example for differentiable plasticity. It is fully functional but not very flexible.\n\n# Usage: python simple.py [rngseed], where rngseed is an optional parameter specifying the seed of the random number generator. \n# To use it on a GPU or CPU, toggle comments on the 'ttype' declaration below.\n\n\n\n\n\nimport argparse\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nfrom torch import optim\nimport random\nimport sys\nimport pickle as pickle\nimport pdb\nimport time\n\n\nPATTERNSIZE = 1000\nNBNEUR = PATTERNSIZE+1  # NbNeur = Pattern Size + 1 \"bias\", fixed-output neuron (bias neuron not needed for this task, but included for completeness)\n#ETA = .01               # The \"learning rate\" of plastic connections - we actually learn it\nADAMLEARNINGRATE =3e-4  # The learning rate of the Adam optimizer \nRNGSEED = 0             # Initial random seed - can be modified by passing a number as command-line argument\n\n# Note that these patterns are likely not optimal\nPROBADEGRADE = .5       # Proportion of bits to zero out in the target pattern at test time\nNBPATTERNS = 5          # The number of 
patterns to learn in each episode\nNBPRESCYCLES = 2        # Number of times each pattern is to be presented\nPRESTIME = 6            # Number of time steps for each presentation\nPRESTIMETEST = 6        # Same thing but for the final test pattern\nINTERPRESDELAY = 4      # Duration of zero-input interval between presentations\nNBSTEPS = NBPRESCYCLES * ((PRESTIME + INTERPRESDELAY) * NBPATTERNS) + PRESTIMETEST  # Total number of steps per episode\n\n\nif len(sys.argv) == 2:\n    RNGSEED = int(sys.argv[1])\n    print(\"Setting RNGSEED to \"+str(RNGSEED))\nnp.set_printoptions(precision=3)\nnp.random.seed(RNGSEED); random.seed(RNGSEED); torch.manual_seed(RNGSEED)\n\n\n#ttype = torch.FloatTensor;         # For CPU\nttype = torch.cuda.FloatTensor;     # For GPU\n\n\n# Generate the full list of inputs for an episode. The inputs are returned as a PyTorch tensor of shape NbSteps x 1 x NbNeur\ndef generateInputsAndTarget():\n    inputT = np.zeros((NBSTEPS, 1, NBNEUR)) #inputTensor, initially in numpy format...\n\n    # Create the random patterns to be memorized in an episode\n    seedp = np.ones(PATTERNSIZE); seedp[:PATTERNSIZE//2] = -1\n    patterns=[]\n    for nump in range(NBPATTERNS):\n        p = np.random.permutation(seedp)\n        patterns.append(p)\n\n    # Now 'patterns' contains the NBPATTERNS patterns to be memorized in this episode - in numpy format\n    # Choosing the test pattern, partially zero'ed out, that the network will have to complete\n    testpattern = random.choice(patterns).copy()\n    preservedbits = np.ones(PATTERNSIZE); preservedbits[:int(PROBADEGRADE * PATTERNSIZE)] = 0; np.random.shuffle(preservedbits)\n    degradedtestpattern = testpattern * preservedbits\n\n    # Inserting the inputs in the input tensor at the proper places\n    for nc in range(NBPRESCYCLES):\n        np.random.shuffle(patterns)\n        for ii in range(NBPATTERNS):\n            for nn in range(PRESTIME):\n                numi =nc * (NBPATTERNS * (PRESTIME+INTERPRESDELAY)) + 
ii * (PRESTIME+INTERPRESDELAY) + nn\n                inputT[numi][0][:PATTERNSIZE] = patterns[ii][:]\n\n    # Inserting the degraded pattern\n    for nn in range(PRESTIMETEST):\n        inputT[-PRESTIMETEST + nn][0][:PATTERNSIZE] = degradedtestpattern[:]\n\n    for nn in range(NBSTEPS):\n        inputT[nn][0][-1] = 1.0  # Bias neuron.\n        inputT[nn] *= 20.0       # Strengthen inputs\n    inputT = torch.from_numpy(inputT).type(ttype)  # Convert from numpy to Tensor\n    target = torch.from_numpy(testpattern).type(ttype)\n\n    return inputT, target\n\n\n\nclass NETWORK(nn.Module):\n    def __init__(self):\n        super(NETWORK, self).__init__()\n        # Notice that the vectors are row vectors, and the matrices are transposed wrt the usual order, following apparent pytorch conventions\n        # Each *column* of w targets a single output neuron\n        self.w = Variable(.01 * torch.randn(NBNEUR, NBNEUR).type(ttype), requires_grad=True)   # The matrix of fixed (baseline) weights\n        self.alpha = Variable(.01 * torch.randn(NBNEUR, NBNEUR).type(ttype), requires_grad=True)  # The matrix of plasticity coefficients\n        self.eta = Variable(.01 * torch.ones(1).type(ttype), requires_grad=True)  # The weight decay term / \"learning rate\" of plasticity - trainable, but shared across all connections\n\n    def forward(self, input, yin, hebb):\n        # Run the network for one timestep\n        yout = F.tanh( yin.mm(self.w + torch.mul(self.alpha, hebb)) + input )\n        hebb = (1 - self.eta) * hebb + self.eta * torch.bmm(yin.unsqueeze(2), yout.unsqueeze(1))[0] # bmm here is used to implement an outer product between yin and yout, with the help of unsqueeze (i.e. 
added empty dimensions)\n        return yout, hebb\n\n    def initialZeroState(self):\n        # Return an initialized, all-zero hidden state\n        return Variable(torch.zeros(1, NBNEUR).type(ttype))\n\n    def initialZeroHebb(self):\n        # Return an initialized, all-zero Hebbian trace\n        return Variable(torch.zeros(NBNEUR, NBNEUR).type(ttype))\n\n\nnet = NETWORK()\noptimizer = torch.optim.Adam([net.w, net.alpha, net.eta], lr=ADAMLEARNINGRATE)\ntotal_loss = 0.0; all_losses = []\nprint_every = 10\nnowtime = time.time()\n\nfor numiter in range(2000):\n    # Initialize network for each episode\n    y = net.initialZeroState()\n    hebb = net.initialZeroHebb()\n    optimizer.zero_grad()\n\n    # Generate the inputs and target pattern for this episode\n    inputs, target = generateInputsAndTarget()\n\n    # Run the episode!\n    for numstep in range(NBSTEPS):\n        y, hebb = net(Variable(inputs[numstep], requires_grad=False), y, hebb)\n\n    # Compute loss for this episode (last step only)\n    loss = (y[0][:PATTERNSIZE] - Variable(target, requires_grad=False)).pow(2).sum()\n\n    # Apply backpropagation to adapt basic weights and plasticity coefficients\n    loss.backward()\n    optimizer.step()\n\n\n    # That's it for the actual algorithm!\n    # Print statistics, save files\n    #lossnum = loss.data[0]   # Saved loss is the actual learning loss (MSE)\n    to = target.cpu().numpy(); yo = y.data.cpu().numpy()[0][:PATTERNSIZE]; z = (np.sign(yo) != np.sign(to)); lossnum = np.mean(z)  # Saved loss is the error rate\n    \n    total_loss  += lossnum\n    if (numiter+1) % print_every == 0:\n        print((numiter, \"====\"))\n        print(target.cpu().numpy()[-10:])   # Target pattern to be reconstructed\n        print(inputs.cpu().numpy()[numstep][0][-10:])  # Last input contains the degraded pattern fed to the network at test time\n        print(y.data.cpu().numpy()[0][-10:])   # Final output of the network\n        previoustime = nowtime\n        nowtime 
= time.time()\n        print(\"Time spent on last\", print_every, \"iters: \", nowtime - previoustime)\n        total_loss /= print_every\n        all_losses.append(total_loss)\n        print(\"Mean loss over last\", print_every, \"iters:\", total_loss)\n        print(\"\")\n        with open('output_simple_'+str(RNGSEED)+'.dat', 'wb') as fo:\n            pickle.dump(net.w.data.cpu().numpy(), fo)\n            pickle.dump(net.alpha.data.cpu().numpy(), fo)\n            pickle.dump(y.data.cpu().numpy(), fo)  # The final y for this episode\n            pickle.dump(all_losses, fo)\n        with open('loss_simple_'+str(RNGSEED)+'.txt', 'w') as fo:\n            for item in all_losses:\n                fo.write(\"%s\\n\" % item)\n        total_loss = 0\n\n\n\n"
  },
  {
    "path": "simple/simplest.py",
    "content": "# Differentiable plasticity: simplest fully functional code.\n\n# Copyright (c) 2018 Uber Technologies, Inc.\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#    http://www.apache.org/licenses/LICENSE-2.0\n#    Unless required by applicable law or agreed to in writing, software\n#    distributed under the License is distributed on an \"AS IS\" BASIS,\n#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#    See the License for the specific language governing permissions and\n#    limitations under the License.\n\n\n# This is a very simple, but fully functional implementation of Differentiable\n# Plasticity. It implements the binary pattern completion task discussed in\n# Section 4.1 of Miconi et al. ICML 2018 (https://arxiv.org/abs/1804.02464).\n\n# The code implements a simple RNN with plastic weights. It requires PyTorch,\n# but does not use a GPU.\n\n# The actual code that specifically implements plasticity\n# amounts to less than 4 lines of code in total (see Section\n# S1 in the paper cited above).\n\n\nimport argparse\nimport torch\nfrom torch.autograd import Variable\nimport numpy as np\nimport torch.nn.functional as F\nimport random\nimport time\n\nPATTERNSIZE = 1000      # Size of the patterns to memorize \nNBNEUR = PATTERNSIZE    # One neuron per pattern element\nNBPATTERNS = 5          # The number of patterns to learn in each episode\nNBPRESCYCLES = 2        # Number of times each pattern is to be presented\nPRESTIME = 6            # Number of time steps for each presentation\nPRESTIMETEST = 6        # Same thing but for the final test pattern\nINTERPRESDELAY = 4      # Duration of zero-input interval between presentations\nNBSTEPS = NBPRESCYCLES * ((PRESTIME + INTERPRESDELAY) * NBPATTERNS) + PRESTIMETEST  # Total number of steps per episode\n\n\n# Generate the full list of 
inputs, as well as the target output at last time step, for an episode. \ndef generateInputsAndTarget():\n    inputT = np.zeros((NBSTEPS, 1, NBNEUR)) #inputTensor, initially in numpy format\n    # Create the random patterns to be memorized in an episode\n    patterns=[]\n    for nump in range(NBPATTERNS):\n        patterns.append(2*np.random.randint(2, size=PATTERNSIZE)-1)\n    # Building the test pattern, partially zero'ed out, that the network will have to complete\n    testpattern = random.choice(patterns).copy()\n    degradedtestpattern = testpattern * np.random.randint(2, size=PATTERNSIZE)\n    # Inserting the inputs in the input tensor at the proper places\n    for nc in range(NBPRESCYCLES):\n        np.random.shuffle(patterns)\n        for ii in range(NBPATTERNS):\n            for nn in range(PRESTIME):\n                numi =nc * (NBPATTERNS * (PRESTIME+INTERPRESDELAY)) + ii * (PRESTIME+INTERPRESDELAY) + nn\n                inputT[numi][0][:] = patterns[ii][:]\n    # Inserting the degraded pattern\n    for nn in range(PRESTIMETEST):\n        inputT[-PRESTIMETEST + nn][0][:] = degradedtestpattern[:]\n    inputT = 20.0 * torch.from_numpy(inputT.astype(np.float32))  # Convert from numpy to Tensor\n    target = torch.from_numpy(testpattern.astype(np.float32))\n    return inputT, target\n\ntotal_loss = 0.0; all_losses = []\nnowtime = time.time()\n\n\n# === Actual algorithm ===\n# Note that each column of w and alpha defines the inputs to a single neuron\nw = Variable(.01 * torch.randn(NBNEUR, NBNEUR), requires_grad=True) # Fixed weights\nalpha = Variable(.01 * torch.randn(NBNEUR, NBNEUR), requires_grad=True) # Plasticity coeffs.\noptimizer = torch.optim.Adam([w, alpha], lr=3e-4)\n\nprint(\"Starting episodes...\")\nfor numiter in range(1000): # Loop over episodes\n    y = Variable(torch.zeros(1, NBNEUR)) # Initialize neuron activations\n    hebb = Variable(torch.zeros(NBNEUR, NBNEUR)) # Initialize Hebbian trace\n    inputs, target = generateInputsAndTarget() # 
Generate inputs & target for this episode\n    optimizer.zero_grad()\n    # Run the episode:\n    for numstep in range(NBSTEPS):\n        yout = F.tanh( y.mm(w + torch.mul(alpha, hebb)) +\n                Variable(inputs[numstep], requires_grad=False) )\n        hebb = .99 * hebb + .01 * torch.ger(y[0], yout[0]) # torch.ger = Outer product\n        y = yout\n    # Episode done, now compute loss, apply backpropagation\n    loss = (y[0] - Variable(target, requires_grad=False)).pow(2).sum()\n    loss.backward()\n    optimizer.step()\n\n# === End of actual algorithm ===\n\n\n    # Print statistics\n    print_every = 10\n    to = target.cpu().numpy(); yo = y.data.cpu().numpy()[0][:]\n    z = (np.sign(yo) != np.sign(to)); lossnum = np.mean(z)  # Compute error rate\n    total_loss  += lossnum\n    if (numiter+1) % print_every == 0:\n        previoustime = nowtime;  nowtime = time.time()\n        print(\"Episode\", numiter, \"=== Time spent on last\", print_every, \"iters: \", nowtime - previoustime)\n        print(target.cpu().numpy()[-10:])   # Target pattern to be reconstructed\n        print(inputs.cpu().numpy()[numstep][0][-10:])  # Last input (degraded pattern)\n        print(y.data.cpu().numpy()[0][-10:])   # Final output of the network\n        total_loss /= print_every\n        print(\"Mean error rate over last\", print_every, \"iters:\", total_loss, \"\\n\")\n        total_loss = 0\n\n\n"
  },
  {
    "path": "simplemaze/README.md",
    "content": "# Simple code for the grid maze task.\n\nThis code is a deliberately simplified version of the `maze` experiment. The\ncode is made as simple as possible, with copious comments.\n\n\n##Usage\n\nTo run the program, just run `python3 maze.py`.\nDefault parameters should be able to meta-learn the task.\n\n## Backpropamine network\n\nThe `Network` class in `maze/maze.py` implements a Backpropamine recurrent\nnetwork, that is, a fully-connected recurrent neural network with\nneuromodulated Hebbian plastic connections that is trained by gradient descent.  \n\nHere is the full code for the `Network` class, which contains the entire machinery for Backpropamine (note that it only contains ~25 lines of code).\n\n```python\nclass Network(nn.Module):\n    \n    def __init__(self, isize, hsize): \n        super(Network, self).__init__()\n        self.hsize, self.isize  = hsize, isize \n\n        self.i2h = torch.nn.Linear(isize, hsize)    # Weights from input to recurrent layer\n        self.w =  torch.nn.Parameter(.001 * torch.rand(hsize, hsize))   # Baseline (\"fixed\") component of the plastic recurrent layer\n        \n        self.alpha =  torch.nn.Parameter(.001 * torch.rand(hsize, hsize))   # Plasticity coefficients of the plastic recurrent layer; one alpha coefficient per recurrent connection\n\n        self.h2mod = torch.nn.Linear(hsize, 1)      # Weights from the recurrent layer to the (single) neurodulator output\n        self.modfanout = torch.nn.Linear(1, hsize)  # The modulator output is passed through a different 'weight' for each neuron (it 'fans out' over neurons)\n\n        self.h2o = torch.nn.Linear(hsize, NBACTIONS)    # From recurrent to outputs (action probabilities)\n        self.h2v = torch.nn.Linear(hsize, 1)            # From recurrent to value-prediction (used for A2C)\n\n\n        \n    def forward(self, inputs, hidden): # hidden is a tuple containing h-state and the hebbian trace \n            HS = self.hsize\n        \n            # 
hidden[0] is the h-state; hidden[1] is the Hebbian trace\n            hebb = hidden[1]\n\n\n            # Each *column* of w, alpha and hebb contains the inputs weights to a single neuron\n            hactiv = torch.tanh( self.i2h(inputs) + hidden[0].unsqueeze(1).bmm(self.w + torch.mul(self.alpha, hebb)).squeeze(1)  )\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - to be softmaxed later, outside the function\n            valueout = self.h2v(hactiv)\n\n            # Now computing the Hebbian updates...\n            deltahebb = torch.bmm(hidden[0].unsqueeze(2), hactiv.unsqueeze(1))  # Batched outer product of previous hidden state with new hidden state\n            \n            # We also need to compute the eta (the plasticity rate), wich is determined by neuromodulation\n            myeta = F.tanh(self.h2mod(hactiv)).unsqueeze(2)  # Shape: BatchSize x 1 x 1\n            \n            # The neuromodulated eta is passed through a vector of fanout weights, one per neuron.\n            # Each *column* in w, hebb and alpha constitutes the inputs to a single cell\n            # For w and alpha, columns are 2nd dimension (i.e. dim 1); for hebb, it's dimension 2 (dimension 0 is batch)\n            # The output of the following line has shape BatchSize x 1 x NHidden, i.e. 1 line and NHidden columns for each \n            # batch element. 
When multiplying by hebb (BatchSize x NHidden x NHidden), broadcasting will provide a different\n            # value for each cell but the same value for all inputs of a cell, as required by fanout concept.\n            myeta = self.modfanout(myeta) \n            \n            \n            # Updating Hebbian traces, with a hard clip (other choices are possible)\n            self.clipval = 2.0\n            hebb = torch.clamp(hebb + myeta * deltahebb, min=-self.clipval, max=self.clipval)\n\n            hidden = (hactiv, hebb)\n            return activout, valueout, hidden\n\n\n\n    def initialZeroHebb(self, BATCHSIZE):\n        return Variable(torch.zeros(BATCHSIZE, self.hsize, self.hsize) , requires_grad=False)\n\n    def initialZeroState(self, BATCHSIZE):\n        return Variable(torch.zeros(BATCHSIZE, self.hsize), requires_grad=False )\n\n```\n\n\nThe rest of the code implements a simple\nA2C algorithm to train the network for the Grid Maze task.\n\n## Copyright and licensing information\n\nCopyright (c) 2018-2019 Uber Technologies, Inc.\n\nLicensed under the Uber Non-Commercial License (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at the root directory of this project. \n\nSee the License file in this repository for the specific language governing \npermissions and limitations under the License.\n\n"
  },
  {
    "path": "simplemaze/maze.py",
    "content": "# Backpropamine: differentiable neuromdulated plasticity.\n#\n# Copyright (c) 2018-2019 Uber Technologies, Inc.\n#\n# Licensed under the Uber Non-Commercial License (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at the root directory of this project. \n#\n# See the License file in this repository for the specific language governing \n# permissions and limitations under the License.\n\n# This code implements the \"Grid Maze\" task. See section 4.2 in\n# Miconi et al. ICLR 2019 ( https://openreview.net/pdf?id=r1lrAiA5Ym )\n# or section 4.5 in Miconi et al. \n# ICML 2018 ( https://arxiv.org/abs/1804.02464 )\n\n\n# The Network class implements a \"backpropamine\" network, that is, a neural\n# network with neuromodulated Hebbian plastic connections that is trained by\n# gradient descent. The Backpropamine machinery is\n# entirely contained in the Network class (~25 lines of code). \n\n# The rest of the code implements a simple\n# A2C algorithm to train the network for the Grid Maze task.\n\n\nimport argparse\nimport pdb\n#from line_profiler import LineProfiler\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nfrom torch import optim\nfrom torch.optim import lr_scheduler\nimport random\nimport sys\nimport pickle\nimport time\nimport os\nimport platform\n\nimport numpy as np\n\n\n\n\n\nnp.set_printoptions(precision=4)\n\n\nADDITIONALINPUTS = 4 # 1 input for the previous reward, 1 input for numstep, 1 unused,  1 \"Bias\" input\n\nNBACTIONS = 4   # U, D, L, R\n\nRFSIZE = 3      # Receptive Field: RFSIZE x RFSIZE\n\nTOTALNBINPUTS =  RFSIZE * RFSIZE + ADDITIONALINPUTS + NBACTIONS\n\n\n\n# RNN with trainable modulated plasticity (\"backpropamine\")\nclass Network(nn.Module):\n    \n    def __init__(self, isize, hsize): \n        super(Network, self).__init__()\n        self.hsize, 
self.isize  = hsize, isize \n\n        self.i2h = torch.nn.Linear(isize, hsize)    # Weights from input to recurrent layer\n        self.w =  torch.nn.Parameter(.001 * torch.rand(hsize, hsize))   # Baseline (non-plastic) component of the plastic recurrent layer\n        \n        self.alpha =  torch.nn.Parameter(.001 * torch.rand(hsize, hsize))   # Plasticity coefficients of the plastic recurrent layer; one alpha coefficient per recurrent connection\n        #self.alpha = torch.nn.Parameter(.0001 * torch.rand(1,1,hsize))  # Per-neuron alpha\n        #self.alpha = torch.nn.Parameter(.0001 * torch.ones(1))         # Single alpha for whole network\n\n        self.h2mod = torch.nn.Linear(hsize, 1)      # Weights from the recurrent layer to the (single) neurodulator output\n        self.modfanout = torch.nn.Linear(1, hsize)  # The modulator output is passed through a different 'weight' for each neuron (it 'fans out' over neurons)\n\n        self.h2o = torch.nn.Linear(hsize, NBACTIONS)    # From recurrent to outputs (action probabilities)\n        self.h2v = torch.nn.Linear(hsize, 1)            # From recurrent to value-prediction (used for A2C)\n\n\n        \n    def forward(self, inputs, hidden): # hidden is a tuple containing the h-state (i.e. 
the recurrent hidden state) and the Hebbian trace \n            HS = self.hsize\n        \n            # hidden[0] is the h-state; hidden[1] is the Hebbian trace\n            hebb = hidden[1]\n\n\n            # Each *column* of w, alpha and hebb contains the input weights to a single neuron\n            hactiv = torch.tanh( self.i2h(inputs) + hidden[0].unsqueeze(1).bmm(self.w + torch.mul(self.alpha, hebb)).squeeze(1)  )  # Update the h-state\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - to be softmaxed later, outside the function\n            valueout = self.h2v(hactiv)\n\n            # Now computing the Hebbian updates...\n            deltahebb = torch.bmm(hidden[0].unsqueeze(2), hactiv.unsqueeze(1))  # Batched outer product of previous hidden state with new hidden state\n            \n            # We also need to compute the eta (the plasticity rate), which is determined by neuromodulation\n            # Note that this is \"simple\" neuromodulation.\n            myeta = F.tanh(self.h2mod(hactiv)).unsqueeze(2)  # Shape: BatchSize x 1 x 1\n            \n            # The neuromodulated eta is passed through a vector of fanout weights, one per neuron.\n            # Each *column* in w, hebb and alpha constitutes the inputs to a single cell.\n            # For w and alpha, columns are the 2nd dimension (i.e. dim 1); for hebb, it's dimension 2 (dimension 0 is batch)\n            # The output of the following line has shape BatchSize x 1 x NHidden, i.e. 1 line and NHidden columns for each \n            # batch element. 
When multiplying by hebb (BatchSize x NHidden x NHidden), broadcasting will provide a different\n            # value for each cell but the same value for all inputs of a cell, as required by the fanout concept.\n            myeta = self.modfanout(myeta) \n            \n            \n            # Updating Hebbian traces, with a hard clip (other choices are possible)\n            self.clipval = 2.0\n            hebb = torch.clamp(hebb + myeta * deltahebb, min=-self.clipval, max=self.clipval)\n\n            hidden = (hactiv, hebb)\n            return activout, valueout, hidden\n\n\n\n\n    def initialZeroState(self, BATCHSIZE):\n        return Variable(torch.zeros(BATCHSIZE, self.hsize), requires_grad=False )\n\n    # In plastic networks, we must also initialize the Hebbian state:\n    def initialZeroHebb(self, BATCHSIZE):\n        return Variable(torch.zeros(BATCHSIZE, self.hsize, self.hsize) , requires_grad=False)\n\n\n\n# That's it for plasticity! The rest of the code simply implements the maze task and the A2C RL algorithm.\n\n\n\n\ndef train(paramdict):\n    #params = dict(click.get_current_context().params)\n\n    #TOTALNBINPUTS =  RFSIZE * RFSIZE + ADDITIONALINPUTS + NBNONRESTACTIONS\n    print(\"Starting training...\")\n    params = {}\n    #params.update(defaultParams)\n    params.update(paramdict)\n    print(\"Passed params: \", params)\n    print(platform.uname())\n    #params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['interpresdelay']) * params['nbclasses']) + params['prestimetest']  # Total number of steps per episode\n    suffix = \"btchFixmod_\"+\"\".join([str(x)+\"_\" if pair[0] != 'nbsteps' and pair[0] != 'rngseed' and pair[0] != 'save_every' and pair[0] != 'test_every' and pair[0] != 'pe' else '' for pair in sorted(zip(params.keys(), params.values()), key=lambda x:x[0] ) for x in pair])[:-1] + \"_rngseed_\" + str(params['rngseed'])   # Turning the parameters into a nice suffix for filenames\n\n    # Initialize 
random seeds (first two redundant?)\n    print(\"Setting random seeds\")\n    np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n    #print(click.get_current_context().params)\n\n    print(\"Initializing network\")\n    use_cuda = torch.cuda.is_available()\n    device = torch.device(\"cuda\" if use_cuda else \"cpu\")\n    \n    net = Network(TOTALNBINPUTS, params['hs']).to(device)  # Creating the network\n    \n    print (\"Shape of all optimized parameters:\", [x.size() for x in net.parameters()])\n    allsizes = [torch.numel(x.data.cpu()) for x in net.parameters()]\n    print (\"Size (numel) of all optimized elements:\", allsizes)\n    print (\"Total size (numel) of all optimized elements:\", sum(allsizes))\n\n    #total_loss = 0.0\n    print(\"Initializing optimizer\")\n    optimizer = torch.optim.Adam(net.parameters(), lr=1.0*params['lr'], eps=1e-4, weight_decay=params['l2'])\n    #optimizer = torch.optim.SGD(net.parameters(), lr=1.0*params['lr'])\n    #scheduler = torch.optim.lr_scheduler.StepLR(optimizer, gamma=params['gamma'], step_size=params['steplr'])\n\n\n    BATCHSIZE = params['bs']\n\n    LABSIZE = params['msize'] \n    lab = np.ones((LABSIZE, LABSIZE))\n    CTR = LABSIZE // 2 \n\n\n    # Grid maze\n    lab[1:LABSIZE-1, 1:LABSIZE-1].fill(0)\n    for row in range(1, LABSIZE - 1):\n        for col in range(1, LABSIZE - 1):\n            if row % 2 == 0 and col % 2 == 0:\n                lab[row, col] = 1\n    # Not strictly necessary, but cleaner since we start the agent at the\n    # center for each episode; may help localization in some maze sizes\n    # (including 13 and 9, but not 11) by introducing a detectable irregularity\n    # in the center:\n    lab[CTR,CTR] = 0 \n\n\n\n    all_losses = []\n    all_grad_norms = []\n    all_losses_objective = []\n    all_total_rewards = []\n    all_losses_v = []\n    lossbetweensaves = 0\n    nowtime = time.time()\n    meanrewards = np.zeros((LABSIZE, 
LABSIZE))\n    meanrewardstmp = np.zeros((LABSIZE, LABSIZE, params['eplen']))\n\n\n    pos = 0\n    hidden = net.initialZeroState(BATCHSIZE)\n    hebb = net.initialZeroHebb(BATCHSIZE)\n    #pw = net.initialZeroPlasticWeights()  # For eligibility traces\n\n    #celoss = torch.nn.CrossEntropyLoss() # For supervised learning - not used here\n\n\n    print(\"Starting episodes!\")\n\n    for numiter in range(params['nbiter']):\n\n        PRINTTRACE = 0\n        #if (numiter+1) % (1 + params['pe']) == 0:\n        if (numiter+1) % (params['pe']) == 0:\n            PRINTTRACE = 1\n\n        #lab = makemaze.genmaze(size=LABSIZE, nblines=4)\n        #count = np.zeros((LABSIZE, LABSIZE))\n\n        # Select the reward location for this episode - not on a wall!\n        # And not on the center either! (though not sure how useful that restriction is...)\n        # We always start the episode from the center \n        posr = {}; posc = {}\n        rposr = {}; rposc = {}\n        for nb in range(BATCHSIZE):\n            # Note: it doesn't matter if the reward is on the center (see below). 
All we need is not to put it on a wall or pillar (lab=1)\n            myrposr = 0; myrposc = 0\n            while lab[myrposr, myrposc] == 1 or (myrposr == CTR and myrposc == CTR):\n                myrposr = np.random.randint(1, LABSIZE - 1)\n                myrposc = np.random.randint(1, LABSIZE - 1)\n            rposr[nb] = myrposr; rposc[nb] = myrposc\n            #print(\"Reward pos:\", rposr, rposc)\n            # Agent always starts an episode from the center\n            posc[nb] = CTR\n            posr[nb] = CTR\n\n        optimizer.zero_grad()\n        loss = 0\n        lossv = 0\n        hidden = net.initialZeroState(BATCHSIZE).to(device)\n        hebb = net.initialZeroHebb(BATCHSIZE).to(device)\n        numactionchosen = 0\n\n\n        reward = np.zeros(BATCHSIZE)\n        sumreward = np.zeros(BATCHSIZE)\n        rewards = []\n        vs = []\n        logprobs = []\n        dist = 0\n        numactionschosen = np.zeros(BATCHSIZE, dtype='int32')\n\n        #reloctime = np.random.randint(params['eplen'] // 4, (3 * params['eplen']) // 4)\n\n        #print(\"EPISODE \", numiter)\n        for numstep in range(params['eplen']):\n\n\n\n            inputs = np.zeros((BATCHSIZE, TOTALNBINPUTS), dtype='float32') \n        \n            labg = lab.copy()\n            for nb in range(BATCHSIZE):\n                inputs[nb, 0:RFSIZE * RFSIZE] = labg[posr[nb] - RFSIZE//2:posr[nb] + RFSIZE//2 +1, posc[nb] - RFSIZE //2:posc[nb] + RFSIZE//2 +1].flatten() * 1.0\n                \n                # Previous chosen action\n                inputs[nb, RFSIZE * RFSIZE +1] = 1.0 # Bias neuron\n                inputs[nb, RFSIZE * RFSIZE +2] = numstep / params['eplen']\n                inputs[nb, RFSIZE * RFSIZE +3] = 1.0 * reward[nb]\n                inputs[nb, RFSIZE * RFSIZE + ADDITIONALINPUTS + numactionschosen[nb]] = 1\n            \n            inputsC = torch.from_numpy(inputs).to(device)\n\n            ## Running the network\n            y, v, (hidden, hebb) = 
net(inputsC, (hidden, hebb))  # y  should output raw scores, not probas\n\n\n            y = torch.softmax(y, dim=1)\n            distrib = torch.distributions.Categorical(y)\n            actionschosen = distrib.sample()  \n            logprobs.append(distrib.log_prob(actionschosen))\n            numactionschosen = actionschosen.data.cpu().numpy()  # We want to break gradients\n            reward = np.zeros(BATCHSIZE, dtype='float32')\n\n\n            for nb in range(BATCHSIZE):\n                myreward = 0\n                numactionchosen = numactionschosen[nb]\n\n                tgtposc = posc[nb]\n                tgtposr = posr[nb]\n                if numactionchosen == 0:  # Up\n                    tgtposr -= 1\n                elif numactionchosen == 1:  # Down\n                    tgtposr += 1\n                elif numactionchosen == 2:  # Left\n                    tgtposc -= 1\n                elif numactionchosen == 3:  # Right\n                    tgtposc += 1\n                else:\n                    raise ValueError(\"Wrong Action\")\n                \n                reward[nb] = 0.0  # The reward for this step\n                if lab[tgtposr][tgtposc] == 1:\n                    reward[nb] -= params['wp']\n                else:\n                    posc[nb] = tgtposc\n                    posr[nb] = tgtposr\n\n                # Did we hit the reward location ? 
Increase reward and teleport!\n                # Note that it doesn't matter if we teleport onto the reward, since reward hitting is only evaluated after the (obligatory) move...\n                # But we still avoid it.\n                if rposr[nb] == posr[nb] and rposc[nb] == posc[nb]:\n                    reward[nb] += params['rew']\n                    posr[nb]= np.random.randint(1, LABSIZE - 1)\n                    posc[nb] = np.random.randint(1, LABSIZE - 1)\n                    while lab[posr[nb], posc[nb]] == 1 or (rposr[nb] == posr[nb] and rposc[nb] == posc[nb]):\n                        posr[nb] = np.random.randint(1, LABSIZE - 1)\n                        posc[nb] = np.random.randint(1, LABSIZE - 1)\n\n            rewards.append(reward)\n            vs.append(v)\n            sumreward += reward\n\n            # This is an \"entropy penalty\", implemented by the sum-of-squares of the probabilities because our version of PyTorch did not have an entropy() function.\n            # The result is the same: to penalize concentration, i.e. 
encourage diversity in chosen actions.\n            loss += ( params['bent'] * y.pow(2).sum() / BATCHSIZE )  \n\n\n            #if PRINTTRACE:\n            #    print(\"Step \", numstep, \" Inputs (to 1st in batch): \", inputs[0, :TOTALNBINPUTS], \" - Outputs(1st in batch): \", y[0].data.cpu().numpy(), \" - action chosen(1st in batch): \", numactionschosen[0],\n            #            #\" - mean abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())), \n            #            \" -Reward (this step, 1st in batch): \", reward[0])\n\n\n\n        # Episode is done, now let's do the actual computations of rewards and losses for the A2C algorithm\n\n\n        R = torch.zeros(BATCHSIZE).to(device)\n        gammaR = params['gr']\n        for numstepb in reversed(range(params['eplen'])) :\n            R = gammaR * R + torch.from_numpy(rewards[numstepb]).to(device)\n            ctrR = R - vs[numstepb][0]\n            lossv += ctrR.pow(2).sum() / BATCHSIZE\n            loss -= (logprobs[numstepb] * ctrR.detach()).sum() / BATCHSIZE  \n            #pdb.set_trace()\n\n\n\n        loss += params['blossv'] * lossv\n        loss /= params['eplen']\n\n        if PRINTTRACE:\n            if True: \n                print(\"lossv: \", float(lossv))\n            print (\"Total reward for this episode (all):\", sumreward, \"Dist:\", dist)\n\n        loss.backward()\n        all_grad_norms.append(torch.nn.utils.clip_grad_norm(net.parameters(), params['gc']))\n        if numiter > 100:  # Burn-in period for meanrewards\n            optimizer.step()\n\n\n        lossnum = float(loss)\n        lossbetweensaves += lossnum\n        all_losses_objective.append(lossnum)\n        all_total_rewards.append(sumreward.mean())\n\n\n        if (numiter+1) % params['pe'] == 0:\n\n            print(numiter, \"====\")\n            print(\"Mean loss: \", lossbetweensaves / params['pe'])\n            lossbetweensaves = 0\n            print(\"Mean reward (across batch and last\", params['pe'], \"eps.): \", 
np.sum(all_total_rewards[-params['pe']:])/ params['pe'])\n            #print(\"Mean reward (across batch): \", sumreward.mean())\n            previoustime = nowtime\n            nowtime = time.time()\n            print(\"Time spent on last\", params['pe'], \"iters: \", nowtime - previoustime)\n            #print(\"ETA: \", net.eta.data.cpu().numpy(), \" etaet: \", net.etaet.data.cpu().numpy())\n\n        if (numiter+1) % params['save_every'] == 0:\n            print(\"Saving files...\")\n            losslast100 = np.mean(all_losses_objective[-100:])\n            print(\"Average loss over the last 100 episodes:\", losslast100)\n            print(\"Saving local files...\")\n            with open('grad_'+suffix+'.txt', 'w') as thefile:\n                for item in all_grad_norms[::10]:\n                        thefile.write(\"%s\\n\" % item)\n            with open('loss_'+suffix+'.txt', 'w') as thefile:\n                for item in all_total_rewards[::10]:\n                        thefile.write(\"%s\\n\" % item)\n            torch.save(net.state_dict(), 'torchmodel_'+suffix+'.dat')\n            with open('params_'+suffix+'.dat', 'wb') as fo:\n                pickle.dump(params, fo)\n            if os.path.isdir('/mnt/share/tmiconi'):\n                print(\"Transferring to NFS storage...\")\n                for fn in ['params_'+suffix+'.dat', 'loss_'+suffix+'.txt', 'torchmodel_'+suffix+'.dat']:\n                    result = os.system(\n                        'cp {} {}'.format(fn, '/mnt/share/tmiconi/modulmaze/'+fn))\n                print(\"Done!\")\n\n\n\nif __name__ == \"__main__\":\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--rngseed\", type=int, help=\"random seed\", default=0)\n    parser.add_argument(\"--rew\", type=float, help=\"reward value (reward increment for taking correct action after correct stimulus)\", default=10.0)\n    parser.add_argument(\"--wp\", type=float, help=\"penalty for hitting walls\", default=.0)\n    
parser.add_argument(\"--bent\", type=float, help=\"coefficient for the entropy reward (really Simpson index concentration measure)\", default=0.03)\n    parser.add_argument(\"--blossv\", type=float, help=\"coefficient for value prediction loss\", default=.1)\n    parser.add_argument(\"--msize\", type=int, help=\"size of the maze; must be odd\", default=11)\n    parser.add_argument(\"--gr\", type=float, help=\"gammaR: discounting factor for rewards\", default=.9)\n    parser.add_argument(\"--gc\", type=float, help=\"gradient norm clipping\", default=4.0)\n    parser.add_argument(\"--lr\", type=float, help=\"learning rate (Adam optimizer)\", default=1e-4)\n    parser.add_argument(\"--eplen\", type=int, help=\"length of episodes\", default=200)\n    parser.add_argument(\"--hs\", type=int, help=\"size of the recurrent (hidden) layer\", default=100)\n    parser.add_argument(\"--bs\", type=int, help=\"batch size\", default=30)\n    parser.add_argument(\"--l2\", type=float, help=\"coefficient of L2 norm (weight decay)\", default=0) # 3e-6\n    parser.add_argument(\"--nbiter\", type=int, help=\"number of learning cycles\", default=1000000)\n    parser.add_argument(\"--save_every\", type=int, help=\"number of cycles between successive save points\", default=50)\n    parser.add_argument(\"--pe\", type=int, help=\"number of cycles between successive printing of information\", default=10)\n    args = parser.parse_args(); argvars = vars(args); argdict =  { k : argvars[k] for k in argvars if argvars[k] != None }\n    \n    train(argdict)\n\n\n"
  },
  {
    "path": "sr/.gitignore",
    "content": "tmp/\ntmp/*\n*.txt\n*.dat\n*.swp\n"
  },
  {
    "path": "sr/OpusHdfsCopy.py",
    "content": "import os\nimport os.path\n\ndef checkHdfs():\n    return os.path.isfile('/opt/hadoop/latest/bin/hdfs')\n\ndef transferFileToHdfsPath(sourcepath, targetpath):\n    hdfspath = targetpath\n    targetdir = os.path.dirname(targetpath)\n    os.system('/opt/hadoop/latest/bin/hdfs dfs -mkdir -p {}'.format(targetdir))\n    result = os.system(\n        '/opt/hadoop/latest/bin/hdfs dfs -copyFromLocal -f {} {}'.format(sourcepath, hdfspath)\n    )\n    if result != 0:\n        raise OSError('Cannot copyFromLocal {} {} returned {}'.format(sourcepath, hdfspath, result))\n\ndef transferFileToHdfsDir(sourcepath, targetdir):\n    hdfspath = os.path.join(targetdir, os.path.basename(sourcepath))\n    os.system('/opt/hadoop/latest/bin/hdfs dfs -mkdir -p {}'.format(targetdir))\n    result = os.system(\n        '/opt/hadoop/latest/bin/hdfs dfs -copyFromLocal -f {} {}'.format(sourcepath, hdfspath)\n    )\n    if result != 0:\n        raise OSError('Cannot copyFromLocal {} {} returned {}'.format(sourcepath, hdfspath, result))\n\n"
  },
  {
    "path": "sr/README.md",
    "content": "# Target discovery task\n\nA simple stimulus-response (\"SR\") association task.\n\nAt the start of each episode, we generate four random \"cues\" (i.e. random\nbinary vectors of length 20). One of them is randomly chosen as the \"target\".\nThen, we repeatedly show pairs of cues (randomly chosen among the four) in\nsuccession, and ask the network to specify whether one of these two is the\ntarget. If the network's answer is correct, a reward is issued, otherwise\nnothing happens. The network's task is to obtain as much reward as possible\nduring each episode.\n\nNote that the network must identify the target (from reward information alone),\nthen detect it and respond adequately afterwards. Furthermore, because cues are\nshown in pairs, the target can never be fully identified in a single \"trial\": the\nnetwork is forced to integrate information across successive \"trials\".\n\nThe outer-loop metal-learning algorithm is Advantage Actor critic. All\nwithin-episode learning occurs through the self-modulated plasticity of network\nconnections.\n\nUsage:\n\n`python3 srbatch.py --eplen 120 --hs 200 --lr 1e-4 --l2 0 --pe 500 --bv 0.1 --bent 0.1 --rew 1 --wp 0 --save_every 2000 --type modul --da tanh --clamp 0 --nbiter 200000 --fm 1 --ni 4 --pf .0 --alg A3C --cs 20 --eps 1e-6 --is 0 --bs 30 --gc 2.0 --rngseed 0`\n\n\n`eplen` is the length of an episode, `hs` is the hidden/recurrent layer size, `bs` is batch size and `gc` is gradient clipping.\n`type` can be \"modplast\" (simple neuromodulation), \"modul\" (retroactive modulation), \"plastic\" (non-modulated plasticity) or \"rnn\" (no plasticity at all, plain rnn).\n\nNote that `srbatch.py` implements batch training: the first dimension in the data, the hidden state and the Hebbian traces is a batch dimension.\n"
  },
  {
    "path": "sr/anim.py",
    "content": "import argparse\nimport pdb \nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nfrom torch import optim\nfrom torch.optim import lr_scheduler\nimport random\nimport sys\nimport pickle\nimport time\nimport os\nimport OpusHdfsCopy\nfrom OpusHdfsCopy import transferFileToHdfsDir, checkHdfs\nimport platform\n\nimport modul\nfrom modul import Network\n\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport matplotlib.animation as animation\nimport glob\n\n\n\n\n\nnp.set_printoptions(precision=4)\n\nETA = .02  # Not used\n\nADDINPUT = 4 # 1 input for the previous reward, 1 input for numstep, 1 for whether currently on reward square, 1 \"Bias\" input\n\nNBACTIONS = 4  # U, D, L, R\n\nRFSIZE = 3 # Receptive Field\n\nTOTALNBINPUTS =  RFSIZE * RFSIZE + ADDINPUT + NBACTIONS\n\n\nfig = plt.figure()\nplt.axis('off')\n\ndef train(paramdict):\n\n    fname = paramdict['file']\n\n    with open(fname, 'rb') as f:\n        params = pickle.load(f)\n\n    #params = dict(click.get_current_context().params)\n    print(\"Passed params: \", params)\n    print(platform.uname())\n    #params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['interpresdelay']) * params['nbclasses']) + params['prestimetest']  # Total number of steps per episode\n\n    suffix = \"modulmaze_\"+\"\".join([str(x)+\"_\" if pair[0] != 'nbsteps' and pair[0] != 'rngseed' and pair[0] != 'save_every' and pair[0] != 'test_every' else '' for pair in sorted(zip(params.keys(), params.values()), key=lambda x:x[0] ) for x in pair])[:-1] + \"_rngseed_\" + str(params['rngseed'])   # Turning the parameters into a nice suffix for filenames\n\n\n    #params['rngseed'] = 3\n    # Initialize random seeds (first two redundant?)\n    print(\"Setting random seeds\")\n    np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n    
#print(click.get_current_context().params)\n    \n    net = Network(params)\n    # YOU MAY NEED TO CHANGE THE DIRECTORY HERE:\n    net.load_state_dict(torch.load('./tmp/torchmodel_'+suffix + '.dat'))\n\n\n    print (\"Shape of all optimized parameters:\", [x.size() for x in net.parameters()])\n    allsizes = [torch.numel(x.data.cpu()) for x in net.parameters()]\n    print (\"Size (numel) of all optimized elements:\", allsizes)\n    print (\"Total size (numel) of all optimized elements:\", sum(allsizes))\n\n    LABSIZE = params['msize'] \n    lab = np.ones((LABSIZE, LABSIZE))\n    CTR = LABSIZE // 2 \n\n    # Grid maze\n    lab[1:LABSIZE-1, 1:LABSIZE-1].fill(0)\n    for row in range(1, LABSIZE - 1):\n        for col in range(1, LABSIZE - 1):\n            if row % 2 == 0 and col % 2 == 0:\n                lab[row, col] = 1\n    lab[CTR,CTR] = 0 # Not strictly necessary, but perhaps helps localization by introducing a detectable irregularity in the center\n\n\n\n    all_losses = []\n    all_losses_objective = []\n    all_total_rewards = []\n    all_losses_v = []\n    lossbetweensaves = 0\n    nowtime = time.time()\n    meanrewards = np.zeros((LABSIZE, LABSIZE))\n    meanrewardstmp = np.zeros((LABSIZE, LABSIZE, params['eplen']))\n\n    pos = 0\n\n\n    \n    params['nbiter'] = 3\n    ax_imgs = []\n    \n    for numiter in range(params['nbiter']):\n\n        PRINTTRACE = 0\n        #if (numiter+1) % (1 + params['print_every']) == 0:\n        if (numiter+1) % (params['print_every']) == 0:\n            PRINTTRACE = 1\n\n        #lab = makemaze.genmaze(size=LABSIZE, nblines=4)\n        #count = np.zeros((LABSIZE, LABSIZE))\n\n        # Select the reward location for this episode - not on a wall!\n        rposr = 0; rposc = 0\n        while lab[rposr, rposc] == 1:\n            rposr = np.random.randint(1, LABSIZE - 1)\n            rposc = np.random.randint(1, LABSIZE - 1)\n\n        # We always start the episode from the center (when hitting reward, we may teleport either to 
center or to a random location depending on params['rsp'])\n        posc = CTR\n        posr = CTR\n\n        #optimizer.zero_grad()\n        loss = 0\n        lossv = 0\n        hidden = net.initialZeroState()\n        hebb = net.initialZeroHebb()\n        et = net.initialZeroHebb()\n        pw = net.initialZeroPlasticWeights()\n        numactionchosen = 0\n\n\n        reward = 0.0\n        rewards = []\n        vs = []\n        logprobs = []\n        sumreward = 0.0\n        dist = 0\n        \n\n        #print(\"EPISODE \", numiter)\n        for numstep in range(params['eplen']):\n\n\n            if params['clamp'] == 0:\n                inputs = np.zeros((1, TOTALNBINPUTS), dtype='float32') \n            else:\n                inputs = np.zeros((1, params['hs']), dtype='float32')\n        \n            labg = lab.copy()\n            #labg[rposr, rposc] = -1  # The agent can see the reward if it falls within its RF\n            inputs[0, 0:RFSIZE * RFSIZE] = labg[posr - RFSIZE//2:posr + RFSIZE//2 +1, posc - RFSIZE //2:posc + RFSIZE//2 +1].flatten() * 1.0\n            \n            # Additional inputs: bias, normalized time step, previous reward, and a one-hot encoding of the previously chosen action\n            inputs[0, RFSIZE * RFSIZE +1] = 1.0 # Bias neuron\n            inputs[0, RFSIZE * RFSIZE +2] = numstep / params['eplen']\n            inputs[0, RFSIZE * RFSIZE +3] = 1.0 * reward # Reward from previous time step\n            inputs[0, RFSIZE * RFSIZE + ADDINPUT + numactionchosen] = 1\n            inputsC = torch.from_numpy(inputs).cuda()\n\n            ## Running the network\n            y, v, hidden, hebb, et, pw = net(Variable(inputsC, requires_grad=False), hidden, hebb, et, pw)  # y  should output raw scores, not probas\n\n\n            y = F.softmax(y, dim=1)\n            # Must convert y to probas to use this!\n            distrib = torch.distributions.Categorical(y)\n            actionchosen = distrib.sample()  # sample() returns a PyTorch tensor of size 1; this is needed for the backprop below\n            numactionchosen = 
actionchosen.data[0]    # Turn to scalar\n\n            tgtposc = posc\n            tgtposr = posr\n            if numactionchosen == 0:  # Up\n                tgtposr -= 1\n            elif numactionchosen == 1:  # Down\n                tgtposr += 1\n            elif numactionchosen == 2:  # Left\n                tgtposc -= 1\n            elif numactionchosen == 3:  # Right\n                tgtposc += 1\n            else:\n                raise ValueError(\"Wrong Action\")\n            \n            reward = 0.0\n            if lab[tgtposr][tgtposc] == 1:\n                # Hit wall!\n                reward = -params['wp']\n            else:\n                dist += 1\n                posc = tgtposc\n                posr = tgtposr\n            \n            \n            # Display the labyrinth\n\n            #for numr in range(LABSIZE):\n            #    s = \"\"\n            #    for numc in range(LABSIZE):\n            #        if posr == numr and posc == numc:\n            #            s += \"o\"\n            #        elif rposr == numr and rposc == numc:\n            #            s += \"X\"\n            #        elif lab[numr, numc] == 1:\n            #            s += \"#\"\n            #        else:\n            #            s += \" \"\n            #    print(s)\n            #print(\"\")\n            #print(\"\")\n\n            labg = lab.copy()\n            labg[rposr, rposc] = 2\n            labg[posr, posc] = 3\n            fullimg = plt.imshow(labg, animated=True)\n            ax_imgs.append([fullimg])  \n\n\n            # Did we hit the reward location ? 
Increase reward and teleport!\n            # Note that it doesn't matter if we teleport onto the reward, since reward hitting is only evaluated after the (obligatory) move\n            if rposr == posr and rposc == posc:\n                reward += params['rew']\n                if params['rsp'] == 1:\n                    posr = np.random.randint(1, LABSIZE - 1)\n                    posc = np.random.randint(1, LABSIZE - 1)\n                    while lab[posr, posc] == 1:\n                        posr = np.random.randint(1, LABSIZE - 1)\n                        posc = np.random.randint(1, LABSIZE - 1)\n                else:\n                    posr = CTR\n                    posc = CTR\n\n\n            #if PRINTTRACE:\n            #    #print(\"Step \", numstep, \"- GI: \", goodinput, \", GA: \", goodaction, \" Inputs: \", inputsN, \" - Outputs: \", y.data.cpu().numpy(), \" - action chosen: \", numactionchosen,\n            #    #        \" - inputthisstep:\", inputthisstep, \" - mean abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())), \" -Rew: \", reward)\n            #    print(\"Step \", numstep, \" Inputs: \", inputs[0,:TOTALNBINPUTS], \" - Outputs: \", y.data.cpu().numpy(), \" - action chosen: \", numactionchosen,\n            #            \" - mean abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())), \" -Reward (this step): \", reward)\n            rewards.append(reward)\n            vs.append(v)\n            sumreward += reward\n\n\n\n            logprobs.append(distrib.log_prob(actionchosen))\n\n            #if params['algo'] == 'A3C':\n            loss += params['bentropy'] * y.pow(2).sum()   # We want to penalize concentration, i.e. 
encourage diversity; our version of PyTorch does not have an entropy() function for Distribution, so we use this instead.\n\n            ##if PRINTTRACE:\n            ##    print(\"Probabilities:\", y.data.cpu().numpy(), \"Picked action:\", numactionchosen, \", got reward\", reward)\n\n\n        # Episode is done, now let's do the actual computations\n        gammaR = params['gr']\n        if True: #params['algo'] == 'A3C':\n            R = 0\n            for numstepb in reversed(range(params['eplen'])) :\n                R = gammaR * R + rewards[numstepb]\n                lossv += (vs[numstepb][0] - R).pow(2)\n                loss -= logprobs[numstepb] * (R - vs[numstepb].data[0][0])  # Not sure if the \"data\" is needed... put it b/c of worry about weird gradient flows\n            loss += params['blossv'] * lossv\n\n        #elif params['algo'] == 'REI':\n        #    R = sumreward\n        #    baseline = meanrewards[rposr, rposc]\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        loss -= logprobs[numstepb] * (R - baseline)\n        #elif params['algo'] == 'REINOB':\n        #    R = sumreward\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        loss -= logprobs[numstepb] * R\n        #elif params['algo'] == 'REITMP':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        loss -= logprobs[numstepb] * R\n        #elif params['algo'] == 'REITMPB':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        loss -= logprobs[numstepb] * (R - meanrewardstmp[rposr, rposc, numstepb])\n\n        #else:\n        #    raise ValueError(\"Which algo?\")\n\n        meanrewards[rposr, rposc] = (1.0 - params['nu']) * meanrewards[rposr, rposc] + params['nu'] * sumreward\n        R = 0\n        for numstepb in 
reversed(range(params['eplen'])) :\n            R = gammaR * R + rewards[numstepb]\n            meanrewardstmp[rposr, rposc, numstepb] = (1.0 - params['nu']) * meanrewardstmp[rposr, rposc, numstepb] + params['nu'] * R\n\n        loss /= params['eplen']\n\n        if True: #PRINTTRACE:\n            if True: #params['algo'] == 'A3C':\n                print(\"lossv: \", lossv.data.cpu().numpy()[0])\n            print (\"Total reward for this episode:\", sumreward, \"Dist:\", dist)\n\n        #if numiter > 100:  # Burn-in period for meanrewards\n        #    loss.backward()\n        #    optimizer.step()\n\n        #torch.cuda.empty_cache()\n\n        #print(sumreward)\n        lossnum = loss.data[0]\n        lossbetweensaves += lossnum\n        all_losses_objective.append(lossnum)\n        all_total_rewards.append(sumreward)\n            #all_losses_v.append(lossv.data[0])\n        #total_loss  += lossnum\n\n\n        if True: #PRINTTRACE:\n            print(\"lossv: \", lossv.data.cpu().numpy()[0])\n            print (\"Total reward for this episode:\", sumreward, \"Dist:\", dist)\n\n\n    print(\"Saving animation....\")\n    anim = animation.ArtistAnimation(fig, ax_imgs, interval=200)\n    anim.save('anim.gif', writer='imagemagick', fps=10)\n\n\n\nif __name__ == \"__main__\":\n#defaultParams = {\n#    'type' : 'lstm',\n#    'seqlen' : 200,\n#    'hiddensize': 500,\n#    'activ': 'tanh',\n#    'steplr': 10e9,  # By default, no change in the learning rate\n#    'gamma': .5,  # The annealing factor of learning rate decay for Adam\n#    'imagesize': 31,    \n#    'nbiter': 30000,  \n#    'lr': 1e-4,   \n#    'test_every': 10,\n#    'save_every': 3000,\n#    'rngseed':0\n#}\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--file\", help=\"params file\")\n    args = parser.parse_args(); argvars = vars(args); argdict =  { k : argvars[k] for k in argvars if argvars[k] != None }\n    train(argdict)\n\n"
  },
  {
    "path": "sr/makefigure.py",
    "content": "import numpy as np\nimport glob\nimport matplotlib.pyplot as plt\nimport scipy\nfrom scipy import stats\n\ncolorz = ['g', 'orange', 'r', 'b', 'c', 'm', 'y', 'k']\n\n\n\ngroupnames = glob.glob('./tmp/loss_SRB_addpw_2_alg_A3C_bent_0.1_blossv_0.1_bs_30_bv_0.1_clamp_0_cs_20_da_tanh_eplen_120_eps_1e-06_fm_1_gc_2.0_gr_0.9_hs_200_is_0_l2_0.0_lr_0.0001_nbiter_200000_ni_4_nu_0.1_pe_500_pf_0.0_rew_1.0_rule_hebb_type_*_wp_0.0_rngseed_0.txt')\n\n#Previous:\n#groupnames = glob.glob('./tmp8/loss_*eplen_251*densize_200*absize_11_*ndstart_1*rngseed_1.txt')  \n#groupnames = glob.glob('./tmp8/loss_*eplen_251*densize_200*absize_11_*ndstart_1*rngseed_1.txt')  \n\n\n#groupnames = glob.glob('./tmp/loss_*new*eplen_251*rngseed_0.txt')  \n#groupnames = glob.glob('./tmp/loss_*new*eplen_250*rngseed_0.txt')  \n\nplt.rc('font', size=14)\n\n\ndef my_mavg(x, N):\n  cumsum = np.cumsum(np.insert(x, 0, 0)) \n  return (cumsum[N:] - cumsum[:-N]) / N\n\nplt.ion()\n#plt.figure(figsize=(5,4))  # Smaller figure = relative larger fonts\nplt.figure()\n\nallmedianls = []\nalllosses = []\nposcol = 0\nminminlen = 999999\nfor numgroup, groupname in enumerate(groupnames):\n    if \"lstm\" in groupname:\n        continue\n    g = groupname[:-6]+\"*\"\n    print(\"====\", groupname)\n    fnames = glob.glob(g)\n    fulllosses=[]\n    losses=[]\n    lgts=[]\n    for fn in fnames:\n        if True:\n            if \"seed_11\" in fn:\n                continue\n            if \"seed_12\" in fn:\n                continue\n            if \"seed_13\" in fn:\n                continue\n            if \"seed_14\" in fn:\n                continue\n            if \"seed_15\" in fn:\n                continue\n\n\n        z = np.loadtxt(fn)\n        \n        z = z[::10] # Decimation - speed things up!\n        #z = my_mavg(z, 20)  # For each run, we average the losses over K successive (decimated) episodes - otherwise figure is unreadable due to noise!\n\n\n        z = z[:1801]\n        \n        #if len(z) < 
9000:\n        #    print(fn)\n        #    continue\n        #z = z[:90]\n        lgts.append(len(z))\n        fulllosses.append(z)\n    minlen = min(lgts)\n    if minlen < minminlen:\n        minminlen = minlen\n    print(minlen)\n    #if minlen < 1000:\n    #    continue\n    for z in fulllosses:\n        losses.append(z[:minlen])\n\n    losses = np.array(losses)\n    alllosses.append(losses)\n    \n    meanl = np.mean(losses, axis=0)\n    stdl = np.std(losses, axis=0)\n    #cil = stdl / np.sqrt(losses.shape[0]) * 1.96  # 95% confidence interval - assuming normality\n    cil = stdl / np.sqrt(losses.shape[0]) * 2.5  # ~95% confidence interval, using the t-distribution for 7 d.f. (critical value ~2.365, rounded up to 2.5)\n\n    medianl = np.median(losses, axis=0)\n    allmedianls.append(medianl)\n    q1l = np.percentile(losses, 25, axis=0)\n    q3l = np.percentile(losses, 75, axis=0)\n    \n    highl = np.max(losses, axis=0)\n    lowl = np.min(losses, axis=0)\n    #highl = meanl+stdl\n    #lowl = meanl-stdl\n\n    xx = range(len(meanl))\n\n    # xticks and labels\n    #xt = range(0, len(meanl), 2000)\n    xt = range(0, 1801, 500)\n    xtl = [str(10 * 10 * i) for i in xt]   # Each tick is 100 episodes: the files record only every 10th loss, and we decimate by another 10x above\n\n    if \"plastic\" in groupname:\n        lbl = \"Non-modulated plastic\"\n    elif \"modplast\" in groupname:\n        lbl = \"Simple modulation\"\n    elif \"modul\" in groupname:\n        lbl = \"Retroactive modulation\"\n    elif \"rnn\" in groupname:\n        lbl = \"Non-plastic\"\n    else:\n        raise ValueError(\"Which type?\")\n\n    #plt.plot(my_mavg(meanl, 100), label=g) #, color='blue')\n    #plt.fill_between(xx, lowl, highl,  alpha=.2)\n    #plt.fill_between(xx, q1l, q3l,  alpha=.1)\n    #plt.plot(meanl) #, color='blue')\n    ####plt.plot(my_mavg(medianl, 100), label=g) #, color='blue')  # my_mavg changes the number of points !\n    #plt.plot(my_mavg(q1l, 100), label=g, alpha=.3) #, color='blue')\n    
#plt.plot(my_mavg(q3l, 100), label=g, alpha=.3) #, color='blue')\n    #plt.fill_between(xx, q1l, q3l,  alpha=.2)\n    #plt.plot(medianl, label=g) #, color='blue')\n   \n    AVGSIZE = 20\n    \n    xlen = len(my_mavg(q1l, AVGSIZE))\n    plt.fill_between( range(xlen), my_mavg(q1l, AVGSIZE), my_mavg(q3l, AVGSIZE),  alpha=.2, color=colorz[poscol % len(colorz)])\n    plt.plot(my_mavg(medianl, AVGSIZE), color=colorz[poscol % len(colorz)], label=lbl)  # my_mavg changes the number of points !\n    \n    #xlen = len(my_mavg(meanl, AVGSIZE))\n    #plt.plot(my_mavg(meanl, AVGSIZE), label=g, color=colorz[poscol % len(colorz)])  # my_mavg changes the number of points !\n    #plt.fill_between( range(xlen), my_mavg(meanl - cil, AVGSIZE), my_mavg(meanl + cil, AVGSIZE),  alpha=.2, color=colorz[poscol % len(colorz)])\n    \n    poscol += 1\n    \n    #plt.fill_between( range(xlen), my_mavg(lowl, 100), my_mavg(highl, 100),  alpha=.2, color=colorz[numgroup % len(colorz)])\n\n    #plt.plot(my_mavg(losses[0], 1000), label=g, color=colorz[numgroup % len(colorz)])\n    #for curve in losses[1:]:\n    #    plt.plot(my_mavg(curve, 1000), color=colorz[numgroup % len(colorz)])\n\nps = []\n# Adapt for varying lengths across groups\n#for n in range(0, alllosses[0].shape[1], 3):\nfor n in range(0, minminlen):\n    ps.append(scipy.stats.ranksums(alllosses[0][:,n], alllosses[1][:,n]).pvalue)\nps = np.array(ps)\nprint(np.mean(ps[-500:] < .05), np.mean(ps[-500:] < .01))\n\nplt.legend(loc='best', fontsize=14)\n#plt.xlabel('Loss (sum square diff. b/w final output and target)')\nplt.xlabel('Number of Episodes')\nplt.ylabel('Reward')\nplt.xticks(xt, xtl)\n#plt.tight_layout()\n\n\n\n"
  },
  {
    "path": "sr/modul.py",
    "content": "import pdb\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nimport torch.nn.functional as F\n\n\n\n\n#ttype = torch.FloatTensor;\n#ttype = torch.cuda.FloatTensor;\n\n\n\nclass NonPlasticRNN(nn.Module):\n    def __init__(self, params):\n        super(NonPlasticRNN, self).__init__()\n        # NOTE: 'outputsize' excludes the value and neuromodulator outputs!\n        for paramname in ['outputsize', 'inputsize', 'hs', 'bs', 'fm']:\n            if paramname not in params.keys():\n                raise KeyError(\"Must provide missing key in argument 'params': \"+paramname)\n        NBDA = 1 # For now we limit the number of neuromodulatory-output neurons to 1\n        # Doesn't work with our version of PyTorch:\n        #self.device = torch.device(\"cuda:0\" if self.params['device'] == 'gpu' else \"cpu\")\n        self.params = params\n        self.activ = F.tanh\n        self.i2h = torch.nn.Linear(self.params['inputsize'], params['hs']).cuda()\n        self.w =  torch.nn.Parameter((.01 * torch.t(torch.rand(params['hs'], params['hs']))).cuda(), requires_grad=True) \n        self.h2o = torch.nn.Linear(params['hs'], self.params['outputsize']).cuda()\n        self.h2v = torch.nn.Linear(params['hs'], 1).cuda()\n\n\n    def forward(self, inputs, hidden): #, hebb):\n        BATCHSIZE = self.params['bs']\n        HS = self.params['hs']\n\n        # Here, the *rows* of w and hebb are the input weights to a single neuron\n        # hidden = x, hactiv = y\n        hactiv = self.activ(self.i2h(inputs).view(BATCHSIZE, HS, 1) + torch.matmul(self.w,\n                        hidden.view(BATCHSIZE, HS, 1))).view(BATCHSIZE, HS)\n        #hactiv = self.activ(self.i2h(inputs).view(BATCHSIZE, HS, 1) + torch.matmul((self.w + torch.mul(self.alpha, hebb)),\n        #                hidden.view(BATCHSIZE, HS, 1))).view(BATCHSIZE, HS)\n        activout = 
self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed by the calling program\n        valueout = self.h2v(hactiv)\n\n        hidden = hactiv\n\n        return activout, valueout, hidden #, hebb\n\n\n    def initialZeroState(self):\n        BATCHSIZE = self.params['bs']\n        return Variable(torch.zeros(BATCHSIZE, self.params['hs']), requires_grad=False ).cuda()\n\n\n\n\n\nclass PlasticRNN(nn.Module):\n    def __init__(self, params):\n        super(PlasticRNN, self).__init__()\n        # NOTE: 'outputsize' excludes the value and neuromodulator outputs!\n        for paramname in ['outputsize', 'inputsize', 'hs', 'bs', 'fm']:\n            if paramname not in params.keys():\n                raise KeyError(\"Must provide missing key in argument 'params': \"+paramname)\n        NBDA = 1 # For now we limit the number of neuromodulatory-output neurons to 1\n        # Doesn't work with our version of PyTorch:\n        #self.device = torch.device(\"cuda:0\" if self.params['device'] == 'gpu' else \"cpu\")\n        self.params = params\n        self.activ = F.tanh\n        self.i2h = torch.nn.Linear(self.params['inputsize'], params['hs']).cuda()\n        self.w =  torch.nn.Parameter((.01 * torch.t(torch.rand(params['hs'], params['hs']))).cuda(), requires_grad=True) \n        self.alpha =  torch.nn.Parameter((.01 * torch.t(torch.rand(params['hs'], params['hs']))).cuda(), requires_grad=True)\n        self.eta = torch.nn.Parameter((.1 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same eta\n        #self.h2DA = torch.nn.Linear(params['hs'], NBDA).cuda()\n        self.h2o = torch.nn.Linear(params['hs'], self.params['outputsize']).cuda()\n        self.h2v = torch.nn.Linear(params['hs'], 1).cuda()\n\n    def forward(self, inputs, hidden, hebb):\n        BATCHSIZE = self.params['bs']\n        HS = self.params['hs']\n\n        # Here, the *rows* of w and hebb are the inputs weights to a single neuron\n        # hidden = x, hactiv = y\n        hactiv = 
self.activ(self.i2h(inputs).view(BATCHSIZE, HS, 1) + torch.matmul((self.w + torch.mul(self.alpha, hebb)),\n                        hidden.view(BATCHSIZE, HS, 1))).view(BATCHSIZE, HS)\n        activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed by the calling program\n        valueout = self.h2v(hactiv)\n\n        # Now computing the Hebbian updates...\n        \n        # deltahebb has shape BS x HS x HS\n        # Each row of hebb contains the input weights to a neuron\n        deltahebb =  torch.bmm(hactiv.view(BATCHSIZE, HS, 1), hidden.view(BATCHSIZE, 1, HS)) # batched outer product...should it be other way round?\n        hebb = torch.clamp(hebb + self.eta * deltahebb, min=-1.0, max=1.0)\n\n        hidden = hactiv\n\n        return activout, valueout, hidden, hebb\n\n    def initialZeroHebb(self):\n        return Variable(torch.zeros(self.params['bs'], self.params['hs'], self.params['hs']) , requires_grad=False).cuda()\n\n    def initialZeroState(self):\n        BATCHSIZE = self.params['bs']\n        return Variable(torch.zeros(BATCHSIZE, self.params['hs']), requires_grad=False ).cuda()\n\n\n\n\nclass SimpleModulRNN(nn.Module):\n    def __init__(self, params):\n        super(SimpleModulRNN, self).__init__()\n        # NOTE: 'outputsize' excludes the value and neuromodulator outputs!\n        for paramname in ['outputsize', 'inputsize', 'hs', 'bs', 'fm']:\n            if paramname not in params.keys():\n                raise KeyError(\"Must provide missing key in argument 'params': \"+paramname)\n        NBDA = 1 # For now we limit the number of neuromodulatory-output neurons to 1\n        # Doesn't work with our version of PyTorch:\n        #self.device = torch.device(\"cuda:0\" if self.params['device'] == 'gpu' else \"cpu\")\n        self.params = params\n        self.activ = F.tanh\n        self.i2h = torch.nn.Linear(self.params['inputsize'], params['hs']).cuda()\n        self.w =  torch.nn.Parameter((.01 * 
torch.t(torch.rand(params['hs'], params['hs']))).cuda(), requires_grad=True) \n        self.alpha =  torch.nn.Parameter((.01 * torch.t(torch.rand(params['hs'], params['hs']))).cuda(), requires_grad=True)\n        self.eta = torch.nn.Parameter((.1 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same eta (only for the non-modulated part, if any!)\n        self.h2DA = torch.nn.Linear(params['hs'], NBDA).cuda()\n        self.h2o = torch.nn.Linear(params['hs'], self.params['outputsize']).cuda()\n        self.h2v = torch.nn.Linear(params['hs'], 1).cuda()\n\n    def forward_test(self, inputs, hidden, hebb):\n        NBDA = 1\n        BATCHSIZE = self.params['bs']\n        HS = self.params['hs']\n        hactiv = self.activ(self.i2h(inputs).view(BATCHSIZE, HS, 1) + torch.matmul(self.w,\n                        hidden.view(BATCHSIZE, HS, 1))).view(BATCHSIZE, HS)\n        activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed by the calling program\n        valueout = self.h2v(hactiv)\n        return activout, valueout, 0, hidden, hebb\n\n    def forward(self, inputs, hidden, hebb):\n        NBDA = 1\n        BATCHSIZE = self.params['bs']\n        HS = self.params['hs']\n\n        # Here, the *rows* of w and hebb are the inputs weights to a single neuron\n        # hidden = x, hactiv = y\n        hactiv = self.activ(self.i2h(inputs).view(BATCHSIZE, HS, 1) + torch.matmul((self.w + torch.mul(self.alpha, hebb)),\n                        hidden.view(BATCHSIZE, HS, 1))).view(BATCHSIZE, HS)\n        activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed by the calling program\n        valueout = self.h2v(hactiv)\n\n        # Now computing the Hebbian updates...\n        \n        # With batching, DAout is a matrix of size BS x 1 (Really BS x NBDA, but we assume NBDA=1 for now in the deltahebb multiplication below)\n        if self.params['da'] == 'tanh':\n            DAout = F.tanh(self.h2DA(hactiv))\n        elif 
self.params['da'] == 'sig':\n            DAout = F.sigmoid(self.h2DA(hactiv))\n        elif self.params['da'] == 'lin':\n            DAout =  self.h2DA(hactiv)\n        else:\n            raise ValueError(\"Which transformation for DAout ?\")\n        \n        # deltahebb has shape BS x HS x HS\n        # Each row of hebb contain the input weights to a neuron\n        deltahebb =  torch.bmm(hactiv.view(BATCHSIZE, HS, 1), hidden.view(BATCHSIZE, 1, HS)) # batched outer product...should it be other way round?\n\n\n        hebb1 = torch.clamp(hebb + DAout.view(BATCHSIZE, 1, 1) * deltahebb, min=-1.0, max=1.0)\n        if self.params['fm'] == 0:\n            # Non-modulated part\n            hebb2 = torch.clamp(hebb + self.eta * deltahebb, min=-1.0, max=1.0)\n        # Soft Clamp (note that it's different from just putting a tanh on top of a freely varying value):\n        #hebb1 = torch.clamp( hebb +  torch.clamp(DAout.view(BATCHSIZE, 1, 1) * deltahebb, min=0.0) * (1 - hebb) +  \n        #        torch.clamp(DAout.view(BATCHSIZE, 1, 1)  * deltahebb, max=0.0) * (hebb + 1) , min=-1.0, max=1.0)\n        #hebb2 = torch.clamp( hebb +  torch.clamp(self.eta * deltahebb, min=0.0) * (1 - hebb) +  torch.clamp(self.eta * deltahebb, max=0.0) * (hebb + 1) , min=-1.0, max=1.0)\n        # Purely additive, no clamping. This will almost certainly diverge, don't use it! 
\n        #hebb1 = hebb + DAout.view(BATCHSIZE, 1, 1) * deltahebb\n        #hebb2 = hebb + self.eta * deltahebb\n\n        if self.params['fm'] == 1:\n            hebb = hebb1\n        elif self.params['fm'] == 0:\n            # Combine the modulated and non-modulated part\n            hebb = torch.cat( (hebb1[:, :self.params['hs']//2, :], hebb2[:,  self.params['hs'] // 2:, :]), dim=1) # Maybe along dim=2 instead?...\n        else:\n            raise ValueError(\"Must select whether fully modulated or not (params['fm'])\")\n\n        hidden = hactiv\n\n        return activout, valueout, DAout, hidden, hebb\n\n    def initialZeroHebb(self):\n        return Variable(torch.zeros(self.params['bs'], self.params['hs'], self.params['hs']) , requires_grad=False).cuda()\n\n    def initialZeroState(self):\n        BATCHSIZE = self.params['bs']\n        return Variable(torch.zeros(BATCHSIZE, self.params['hs']), requires_grad=False ).cuda()\n\n\n\n\n\nclass RetroModulRNN(nn.Module):\n    def __init__(self, params):\n        super(RetroModulRNN, self).__init__()\n        # NOTE: 'outputsize' excludes the value and neuromodulator outputs!\n        for paramname in ['outputsize', 'inputsize', 'hs', 'bs', 'fm']:\n            if paramname not in params.keys():\n                raise KeyError(\"Must provide missing key in argument 'params': \"+paramname)\n        NBDA = 1 # For now we limit the number of neuromodulatory-output neurons to 1\n        # Doesn't work with our version of PyTorch:\n        #self.device = torch.device(\"cuda:0\" if self.params['device'] == 'gpu' else \"cpu\")\n        self.params = params\n        self.activ = F.tanh\n        self.i2h = torch.nn.Linear(self.params['inputsize'], params['hs']).cuda()\n        self.w =  torch.nn.Parameter((.01 * torch.t(torch.rand(params['hs'], params['hs']))).cuda(), requires_grad=True) \n        self.alpha =  torch.nn.Parameter((.01 * torch.t(torch.rand(params['hs'], params['hs']))).cuda(), requires_grad=True)\n        
self.eta = torch.nn.Parameter((.1 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same eta (only for the non-modulated part, if any!)\n        self.etaet = torch.nn.Parameter((.1 * torch.ones(1)).cuda(), requires_grad=True)  # Everyone has the same etaet\n        self.h2DA = torch.nn.Linear(params['hs'], NBDA).cuda()\n        self.h2o = torch.nn.Linear(params['hs'], self.params['outputsize']).cuda()\n        self.h2v = torch.nn.Linear(params['hs'], 1).cuda()\n\n    def forward(self, inputs, hidden, hebb, et, pw):\n            NBDA = 1\n            BATCHSIZE = self.params['bs']\n            HS = self.params['hs']\n    \n            hactiv = self.activ(self.i2h(inputs).view(BATCHSIZE, HS, 1) + torch.matmul((self.w + torch.mul(self.alpha, pw)),\n                            hidden.view(BATCHSIZE, HS, 1))).view(BATCHSIZE, HS)\n            activout = self.h2o(hactiv)  # Pure linear, raw scores - will be softmaxed later\n            valueout = self.h2v(hactiv)\n\n            # Now computing the Hebbian updates...\n            \n            # With batching, DAout is a matrix of size BS x 1 (Really BS x NBDA, but we assume NBDA=1 for now in the deltahebb multiplication below)\n            if self.params['da'] == 'tanh':\n                DAout = F.tanh(self.h2DA(hactiv))\n            elif self.params['da'] == 'sig':\n                DAout = F.sigmoid(self.h2DA(hactiv))\n            elif self.params['da'] == 'lin':\n                DAout =  self.h2DA(hactiv)\n            else:\n                raise ValueError(\"Which transformation for DAout ?\")\n            \n            if self.params['rule'] == 'hebb':\n                deltahebb =  torch.bmm(hactiv.view(BATCHSIZE, HS, 1), hidden.view(BATCHSIZE, 1, HS)) # batched outer product...should it be other way round?\n            elif self.params['rule'] == 'oja':\n                deltahebb =  torch.mul(hactiv.view(BATCHSIZE, HS, 1), (hidden.view(BATCHSIZE, 1, HS) - torch.mul(self.w.view(1, HS, HS), 
hactiv.view(BATCHSIZE, HS, 1))))\n            else:\n                raise ValueError(\"Must specify learning rule ('hebb' or 'oja')\")\n\n            # Hard clamp\n            deltapw = DAout.view(BATCHSIZE,1,1) * et\n            pw1 = torch.clamp(pw + deltapw, min=-1.0, max=1.0)\n            \n            # Should we have a fully neuromodulated network, or only half?\n            if self.params['fm'] == 1:\n                pw = pw1\n            elif self.params['fm']==0:\n                hebb = torch.clamp(hebb + self.eta * deltahebb, min=-1.0, max=1.0)\n                pw = torch.cat( (hebb[:, :self.params['hs']//2, :], pw1[:,  self.params['hs'] // 2:, :]), dim=1) # Maybe along dim=2 instead?...\n            else:\n                raise ValueError(\"Must select whether fully modulated or not\")\n\n            # Updating the eligibility trace - always a simple decay term. \n            # Note that self.etaet != self.eta (which is used for hebb, i.e. the non-modulated part)\n            deltaet = deltahebb\n            et = (1 - self.etaet) * et + self.etaet *  deltaet\n            \n            hidden = hactiv\n            return activout, valueout, DAout, hidden, hebb, et, pw\n        \n        \n        \n\n    def initialZeroHebb(self):\n        return Variable(torch.zeros(self.params['bs'], self.params['hs'], self.params['hs']) , requires_grad=False).cuda()\n    \n    def initialZeroPlasticWeights(self):\n        return Variable(torch.zeros(self.params['bs'], self.params['hs'], self.params['hs']) , requires_grad=False).cuda()\n    def initialZeroState(self):\n        return Variable(torch.zeros(self.params['bs'], self.params['hs']), requires_grad=False ).cuda()\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n"
  },
  {
    "path": "sr/opus.docker.old",
    "content": "#tmiconi_rl\n#latest\n#.\n\n\n#FROM localhost:5000/opus-deep-learning:master-test-2017_9_7_20_56_10\n#FROM opus-deep-learning:master-test-2018_1_3_0_38_14\nFROM opus-deep-learning:master-prod-2018_9_20_18_2_31\n\n\n\nRUN mkdir /home/work\n\nCOPY ./*.py /home/work/\n\nENV LC_ALL C.UTF-8\nENV  LANG C.UTF-8\n\n"
  },
  {
    "path": "sr/plotmodulator.py",
    "content": "import numpy as np; import matplotlib.pyplot as plt          \n\nc = np.load('cueshown0.dat.npy'); r = np.load('rewardsprevstep0.dat.npy') ; m = np.load('modulator0.dat.npy')\n\nparams = {'legend.fontsize': 'x-large',\n               'axes.labelsize': 'x-large',\n                'axes.titlesize':'x-large',\n                'xtick.labelsize':'x-large',\n                'ytick.labelsize':'x-large'}\nplt.rcParams.update(params)\n\nfig = plt.figure(figsize=(13,10))\n\nfor numgraph in range(c.shape[0]):\n    finalgraph=0\n    if numgraph == c.shape[0] - 1:\n        finalgraph=1\n    ax1 = plt.subplot(c.shape[0]+1, 1, numgraph+1)\n    if numgraph == 0:\n        ax1.set_title('Retroactive neuromodulation')\n    z = np.zeros((6, c[numgraph].size))\n\n    for nn in range(c[numgraph].size):\n        z[np.int(c[numgraph][nn]+1), nn]=1\n    if finalgraph:\n        ax1.set_xlabel('Timestep')\n    ax1.set_xlim(-.5,120.5)\n    ax1.set_ylim(-.5,5.5)\n\n    ax1.imshow(1-z, cmap='gray',clim=(-1,1), aspect='auto')\n    ax1.set_yticks([0,1,2,3,4,5])\n    ax1.set_yticklabels(labels=[\"No cue\", \"Cue 1\", \"Cue 2\", \"Cue 3\", \"Cue 4\", \"Response cue\"])\n    \n    ax2 = ax1.twinx()\n    ax2.set_ylim(-1,1)\n    ax2.plot(m[numgraph], label=\"Modulator\", lw=2)\n    ax2.plot(r[numgraph], label=\"Reward\", lw=2)\n    ax2.plot(np.zeros_like(r[numgraph]), 'k:')\n    if finalgraph:\n        ax2.legend(loc='upper left', bbox_to_anchor=(0, -.2))\n\n\nplt.tight_layout()  # Too tight!\n#fig.subplots_adjust(hspace=0.5)\n\nplt.show()\n"
  },
  {
    "path": "sr/plotresults.py",
    "content": "import numpy as np\nimport glob\nimport matplotlib.pyplot as plt\nimport scipy\nfrom scipy import stats\n\ncolorz = ['r', 'b', 'g', 'c', 'm', 'y', 'orange', 'k']\n\n\n#groupnames = glob.glob('./tmp/loss*CS*cs_10*is_0*lr_3*seed_0.txt')  +  glob.glob('./tmp/loss*CS*eplen_50*seed_0.txt')  \n#groupnames = glob.glob('./tmp/loss*CS*cs_20*eplen_75*eps_1e-06*is_0*seed_0.txt')  # Least bad; lr 1e-4: modul unstable, 3e-5: slow, modul even slower\n#groupnames = glob.glob('./tmp/loss*CS*cs_20*eplen_75*eps_1e-06*gc*is_0*lr_0.00*seed_0.txt')  \n#groupnames = glob.glob('./tmp/loss*gc_7.*seed_0.txt') # see gc 10, 7, 20. For a comparison of many gc's, look at modplast only. \n#groupnames = glob.glob('./tmp/loss*ni_4*seed_0.txt') \n#groupnames = glob.glob('./tmp/loss*SRB*seed_0.txt') \ngroupnames = glob.glob('./tmp/loss*SRB*bent_0.1*cs_*gc_2.0*ni_4*seed_0.txt') \n#groupnames = glob.glob('./tmp/loss*SRB*ni_2*seed_0.txt') \n\n\n#groupnames = glob.glob('./tmp/loss*lvlB*ni_2*seed_0.txt') ; groupnames = [x for x in groupnames if not 'modul2' in x] \n#groupnames = glob.glob('./tmp/loss*Rnd*ni_2*seed_0.txt') ; groupnames = [x for x in groupnames if not 'modul2' in x] \n#groupnames = glob.glob('./tmp/loss*CS*cs_20*eps_1e-06*is_0*seed_0.txt')  \n\n\n#groupnames = glob.glob('./tmp/loss*eps*seed_0.txt')  \n#groupnames = glob.glob('./tmp/loss*NewAdam*addpw_*seed_0.txt')  \n#groupnames = glob.glob('./tmp/loss*EASY*addpw_*ni_2*seed_0.txt')  \n#groupnames = glob.glob('./tmp/loss*SGD*ni_2*seed_0.txt')  \n#groupnames = glob.glob('./tmp/loss*ni_2*seed_0.txt')  \n#groupnames = glob.glob('./tmp/loss*eplen_140*ni_2*seed_0.txt')  \n#groupnames = glob.glob('./tmp/loss*seed_0.txt')  \n\n\n\n\n# If you can only use 7 runs, smooth the losses within each run to obtain more reliable estimates of performance!\n\n\ndef mavg(x, N):\n  cumsum = np.cumsum(np.insert(x, 0, 0)) \n  return (cumsum[N:] - cumsum[:-N]) / N\n\nplt.ion()\n#plt.figure(figsize=(5,4))  # Smaller figure = relative larger 
fonts\nplt.figure()\n\nallmedianls = []\nalllosses = []\nposcol = 0\nmaxminlen = 0\nminminlen = 999999\nfor numgroup, groupname in enumerate(groupnames):\n    if \"batch\"  in groupname:\n        continue\n    #if \"lstm\" not in groupname:\n    #    continue\n    g = groupname[:-6]+\"*\"\n    print(\"====\", groupname)\n    fnames = glob.glob(g)\n    fulllosses=[]\n    losses=[]\n    lgts=[]\n    for fn in fnames:\n        if \"COPY\" in fn:\n            continue\n        if False:\n            #if \"seed_4\" in fn:\n            #    continue\n            #if \"seed_7\" in fn:\n            #    continue\n            if \"seed_3\" in fn:\n                continue\n            #if \"seed_9\" in fn:\n            #    continue\n            #if \"seed_10\" in fn:\n            #    continue\n            if \"seed_11\" in fn:\n                continue\n            if \"seed_12\" in fn:\n                continue\n            if \"seed_13\" in fn:\n                continue\n            if \"seed_14\" in fn:\n                continue\n            if \"seed_15\" in fn:\n                continue\n        z = np.loadtxt(fn)\n        \n        #z = mavg(z, 10)  # For each run, we average the losses over K successive episodes\n\n        z = z[::10] # Decimation - speed things up!\n\n        z = z[:1800]\n\n        print(fn, len(z))\n        if False:\n            if len(z) < 300:\n                print(fn, len(z))\n                continue\n        lgts.append(len(z))\n        fulllosses.append(z)\n    minlen = min(lgts)\n    if minlen > maxminlen:\n        maxminlen = minlen\n    if minlen < minminlen:\n        minminlen = minlen\n    print(\"Minlen:\", minlen)\n    #if minlen < 1000:\n    #    continue\n    for z in fulllosses:\n        losses.append(z[:minlen])\n\n    losses = np.array(losses)\n    alllosses.append(losses)\n    \n    meanl = np.mean(losses, axis=0)\n    stdl = np.std(losses, axis=0)\n    cil = stdl / np.sqrt(losses.shape[0]) * 1.96  # 95% confidence interval 
- assuming normality\n    #cil = stdl / np.sqrt(losses.shape[0]) * 2.5  # 95% confidence interval - approximated with the t-distribution for 7 d.f.\n\n    medianl = np.median(losses, axis=0)\n    allmedianls.append(medianl)\n    q1l = np.percentile(losses, 25, axis=0)\n    q3l = np.percentile(losses, 75, axis=0)\n    \n    highl = np.max(losses, axis=0)\n    lowl = np.min(losses, axis=0)\n    #highl = meanl+stdl\n    #lowl = meanl-stdl\n\n    xx = range(len(meanl))\n\n    # xticks and labels\n    xt = range(0, maxminlen, 500)\n    #xt = range(0, len(meanl), 100)\n    #xt = range(0, len(meanl), 1000)\n    #xt = range(0, 10001, 2000)\n    xtl = [str(10 * 10 * i) for i in xt]   # Because of decimation above, and only every 10th loss is recorded in the files\n\n    #plt.plot(mavg(meanl, 100), label=g) #, color='blue')\n    #plt.fill_between(xx, lowl, highl,  alpha=.2)\n    #plt.fill_between(xx, q1l, q3l,  alpha=.1)\n    #plt.plot(meanl) #, color='blue')\n    ####plt.plot(mavg(medianl, 100), label=g) #, color='blue')  # mavg changes the number of points !\n    #plt.plot(mavg(q1l, 100), label=g, alpha=.3) #, color='blue')\n    #plt.plot(mavg(q3l, 100), label=g, alpha=.3) #, color='blue')\n    #plt.fill_between(xx, q1l, q3l,  alpha=.2)\n    #plt.plot(medianl, label=g) #, color='blue')\n   \n    AVGSIZE = 10  # 20\n    \n    xlen = len(mavg(q1l, AVGSIZE))\n    #mylabel = g[g.find('type'):]\n    mylabel = g\n    if numgroup < 8:\n        zestyle = '-'\n    else:\n        zestyle = '--'\n    \n    zew=2\n    #if 'tanh' in g:\n    #    zew = 3\n    #elif 'sig' in g:\n    #    zew = 1\n    #if 'pw_3' in g:\n    #    zew = 3\n    #elif 'pw_2' in g:\n    #    zew = 1\n    #else:\n    #    raise ValueError(\"Which width?\")\n    \n    plt.plot(mavg(medianl, AVGSIZE), label=mylabel, color=colorz[poscol % len(colorz)], ls=zestyle, lw=zew)  # mavg changes the number of points !\n    plt.fill_between( range(xlen), mavg(q1l, AVGSIZE), mavg(q3l, AVGSIZE),  alpha=.2, color=colorz[poscol 
% len(colorz)])\n    \n    #xlen = len(mavg(meanl, AVGSIZE))\n    #plt.plot(mavg(meanl, AVGSIZE), label=g, color=colorz[poscol % len(colorz)])  # mavg changes the number of points !\n    #plt.fill_between( range(xlen), mavg(meanl - cil, AVGSIZE), mavg(meanl + cil, AVGSIZE),  alpha=.2, color=colorz[poscol % len(colorz)])\n    \n    poscol += 1\n    \n    #plt.fill_between( range(xlen), mavg(lowl, 100), mavg(highl, 100),  alpha=.2, color=colorz[numgroup % len(colorz)])\n\n    #plt.plot(mavg(losses[0], 1000), label=g, color=colorz[numgroup % len(colorz)])\n    #for curve in losses[1:]:\n    #    plt.plot(mavg(curve, 1000), color=colorz[numgroup % len(colorz)])\n\nps = []\n# Adapt for varying lengths across groups\n#for n in range(0, alllosses[0].shape[1], 3):\n\n#for n in range(0, minminlen):\n#    ps.append(scipy.stats.ranksums(alllosses[0][:,n], alllosses[1][:,n]).pvalue)\n#ps = np.array(ps)\n\nplt.legend(loc='best', fontsize=6)\n#plt.xlabel('Loss (sum square diff. b/w final output and target)')\nplt.xlabel('Number of Episodes')\nplt.ylabel('Loss')\nplt.xticks(xt, xtl)\n#plt.tight_layout()\n\n\n\n"
  },
  {
    "path": "sr/request.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2018_9_21_9_55_16\",\n    \"name\":\"Exp3lvlCS5_gc5_15runs_bent0.03_bv0.1_hs200_rew1_wp0_A3C_clamp0_eplen120_addpw3_ni4_l20_modplast_datanh_fm1_pf0_lr1e-4_cs10_eps1e-6_is0_NFS\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 python3 sr.py  --eplen 120 --hs 200 --rule hebb --lr 1e-4 --l2 0 --addpw 3 --pe 1000 --bv 0.1 --bent 0.03 --rew 1 --wp 0 --save_every 5000 --type modplast --da tanh --clamp 0 --nbiter 200000 --fm 1 --ni 4 --pf .0 --alg A3C --cs 10  --eps 1e-6 --is 0 --gc 5.0 --rngseed {{mesos.instance}}\",\n    \"ramMB\":6000,\n    \"gpus\":1,\n    \"diskMB\":6000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":10,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"1080ti\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "sr/request_batch.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2018_10_9_15_13_17\",\n    \"name\":\"ExpSRbatch6_gc2.0_10runs_bent0.1_bv0.1_hs200_rew1_wp0_A3C_clamp0_eplen120_ni4_l20_modul_datanh_fm1_pf0_lr1e-4_cs20_eps1e-6_is0_NFS\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 python3 srbatch.py --eplen 120 --hs 200 --lr 1e-4 --l2 0 --pe 500 --bv 0.1 --bent 0.1 --rew 1 --wp 0 --save_every 2000 --type modul --da tanh --clamp 0 --nbiter 200000 --fm 1 --ni 4 --pf .0 --alg A3C --cs 20 --eps 1e-6 --is 0 --bs 30 --gc 2.0 --rngseed {{mesos.instance}}\",\n    \"ramMB\":6000,\n    \"gpus\":1,\n    \"diskMB\":6000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":10,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"1080ti\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "sr/request_easy.json",
    "content": "{\n    \"dockerImage\":\"tmiconi_rl\", \n    \"tag\":\"master-test-2018_10_4_11_45_47\",\n    \"name\":\"ExpSRbatch3_gc2.5_10runs_bent0.1_bv0.1_hs200_rew1_wp0_A3C_clamp0_eplen75_ni2_l20_plastic_datanh_fm1_pf0_lr1e-4_cs2_eps1e-6_is0_NFS\",\n    \"cpus\":2.0,\n    \"cmdLine\":\"export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \\u0026\\u0026  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/nvidia/bin:/opt/hadoop/latest/bin \\u0026\\u0026 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64/  \\u0026\\u0026  export LC_ALL=C.UTF-8  \\u0026\\u0026   export LANG=C.UTF-8  \\u0026\\u0026  cd /home/work/ \\u0026\\u0026 python3 srbatch.py --eplen 75 --hs 200 --lr 1e-4 --l2 0 --pe 500 --bv 0.1 --bent 0.1 --rew 1 --wp 0 --save_every 2000 --type plastic --da tanh --clamp 0 --nbiter 100000 --fm 1 --ni 2 --pf .0 --alg A3C --cs 2 --eps 1e-6 --is 0 --bs 30 --gc 2.5 --rngseed {{mesos.instance}}\",\n    \"ramMB\":6000,\n    \"gpus\":1,\n    \"diskMB\":6000,\n    \"cluster\":\"opusprodda1\",\n    \"environment\":\"devel\",\n    \"user\":\"tmiconi\",\n    \"resourcePool\": \"/ailabs/p1/tmiconi\",\n    \"instances\":10,\n    \"isService\":false,\n    \"cronSchedule\":\"\",\n    \"custom\":{},\n    \"application\":\"testversion\",\n    \"maxRetries\":1,\n    \"constraints\":{\"sku\":\"1080ti\"},\n    \"accessTypes\":[],\n    \"dependencies\":[],\n    \"cronCollisionPolicy\":\"CANCEL_NEW\",\n    \"emailOnFail\":[],\n    \"emailOnSucceed\":[]\n}\n"
  },
  {
    "path": "sr/srbatch.py",
    "content": "# Stimulus-response task as described in Miconi et al. ICLR 2019.\n\n# Copyright (c) 2018-2019 Uber Technologies, Inc.\n#\n# Licensed under the Uber Non-Commercial License (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at the root directory of this project. \n\nimport argparse\nimport pdb\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nfrom torch import optim\nfrom torch.optim import lr_scheduler\nimport random\nimport sys\nimport pickle\nimport time\nimport os\nimport platform\n#import makemaze\n\nimport numpy as np\n#import matplotlib.pyplot as plt\nimport glob\n\nimport modul  # The code for the actual backrpopamine network\n\n\n\n\nnp.set_printoptions(precision=4)\n\n\n\nADDINPUT = 4 # 1 inputs for the previous reward, 1 inputs for numstep, 1 unused,  1 \"Bias\" inputs\n\n\ndef train(paramdict):\n    #params = dict(click.get_current_context().params)\n\n    #params['inputsize'] =  RFSIZE * RFSIZE + ADDINPUT + NBNONRESTACTIONS\n    print(\"Starting training...\")\n    params = {}\n    #params.update(defaultParams)\n    params.update(paramdict)\n    print(\"Passed params: \", params)\n    print(platform.uname())\n    #params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['interpresdelay']) * params['nbclasses']) + params['prestimetest']  # Total number of steps per episode\n    suffix = \"SRB_\"+\"\".join([str(x)+\"_\" if pair[0] != 'pe' and pair[0] != 'nbsteps' and pair[0] != 'rngseed' and pair[0] != 'save_every' and pair[0] != 'test_every' else '' for pair in sorted(zip(params.keys(), params.values()), key=lambda x:x[0] ) for x in pair])[:-1] + \"_rngseed_\" + str(params['rngseed'])   # Turning the parameters into a nice suffix for filenames\n    print(suffix)\n\n    #NBINPUTBITS = params['ni'] + 1 \n    NBINPUTBITS = params['cs'] + 1 # The 
additional bit is for the response cue (i.e. the \"Go\" cue)\n    params['outputsize'] =  2  # \"response\" and \"no response\"\n    params['inputsize'] = NBINPUTBITS +  params['outputsize'] + ADDINPUT  # The total number of input bits is the size of inputs, plus the \"response cue\" input, plus the number of actions, plus the number of additional inputs\n\n    # This doesn't work with our version of PyTorch\n    #params['device'] = 'gpu'\n    #device = torch.device(\"cuda:0\" if self.params['device'] == 'gpu' else \"cpu\")\n    BS = params['bs']\n\n    # Initialize random seeds (first two redundant?)\n    print(\"Setting random seeds\")\n    np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n    #print(click.get_current_context().params)\n\n    print(\"Initializing network\")\n    if params['type'] == 'modul':\n        net = modul.RetroModulRNN(params)\n    elif params['type'] == 'modplast':\n        net = modul.SimpleModulRNN(params)\n    elif params['type'] == 'plastic':\n        net = modul.PlasticRNN(params)\n    elif params['type'] == 'rnn':\n        net = modul.NonPlasticRNN(params)\n    else:\n        raise ValueError(\"Network type unknown or not yet implemented: \"+params['type'])\n\n    print (\"Shape of all optimized parameters:\", [x.size() for x in net.parameters()])\n    allsizes = [torch.numel(x.data.cpu()) for x in net.parameters()]\n    print (\"Size (numel) of all optimized elements:\", allsizes)\n    print (\"Total size (numel) of all optimized elements:\", sum(allsizes))\n\n    #total_loss = 0.0\n    print(\"Initializing optimizer\")\n    #optimizer = torch.optim.SGD(net.parameters(), lr=1.0*params['lr'], weight_decay=params['l2'])\n    #optimizer = torch.optim.RMSprop(net.parameters(), lr=1.0*params['lr'], weight_decay=params['l2'])\n    optimizer = torch.optim.Adam(net.parameters(), lr=1.0*params['lr'], eps=params['eps'], weight_decay=params['l2'])\n    #optimizer = 
torch.optim.Adam(net.parameters(), lr=1.0*params['lr'], eps=1e-4, weight_decay=params['l2'])\n    #optimizer = torch.optim.SGD(net.parameters(), lr=1.0*params['lr'])\n    #scheduler = torch.optim.lr_scheduler.StepLR(optimizer, gamma=params['gamma'], step_size=params['steplr'])\n\n    #LABSIZE = params['lsize']\n    #lab = np.ones((LABSIZE, LABSIZE))\n    #CTR = LABSIZE // 2\n\n    # Simple cross maze\n    #lab[CTR, 1:LABSIZE-1] = 0\n    #lab[1:LABSIZE-1, CTR] = 0\n\n\n    # Double-T maze\n    #lab[CTR, 1:LABSIZE-1] = 0\n    #lab[1:LABSIZE-1, 1] = 0\n    #lab[1:LABSIZE-1, LABSIZE - 2] = 0\n\n    # Grid maze\n    #lab[1:LABSIZE-1, 1:LABSIZE-1].fill(0)\n    #for row in range(1, LABSIZE - 1):\n    #    for col in range(1, LABSIZE - 1):\n    #        if row % 2 == 0 and col % 2 == 0:\n    #            lab[row, col] = 1\n    #lab[CTR,CTR] = 0 # Not strictly necessary, but perhaps helps localization by introducing a detectable irregularity in the center\n\n\n    #LABSIZE = params['msize'] \n    #lab = np.ones((LABSIZE, LABSIZE))\n    #CTR = LABSIZE // 2 \n\n\n    ## Grid maze\n    #lab[1:LABSIZE-1, 1:LABSIZE-1].fill(0)\n    #for row in range(1, LABSIZE - 1):\n    #    for col in range(1, LABSIZE - 1):\n    #        if row % 2 == 0 and col % 2 == 0:\n    #            lab[row, col] = 1\n    #lab[CTR,CTR] = 0 # Not strictly necessary, but perhaps helps localization by introducing a detectable irregularity in the center\n\n\n\n    all_losses = []\n    all_grad_norms = []\n    all_losses_objective = []\n    all_total_rewards = []\n    all_losses_v = []\n    lossbetweensaves = 0\n    nowtime = time.time()\n    #meanreward = np.zeros((LABSIZE, LABSIZE))\n    meanreward = np.zeros(params['ni'])\n    meanrewardT = np.zeros((params['ni'], params['eplen']))\n\n    nbtrials = [0]*BS\n    totalnbtrials = 0\n    nbtrialswithcc = 0\n\n\n    print(\"Starting episodes!\")\n\n    for numepisode in range(params['nbiter']):\n\n        PRINTTRACE = 0\n        #if (numepisode+1) % (1 + 
params['pe']) == 0:\n        if (numepisode+1) % (params['pe']) == 0:\n            PRINTTRACE = 1\n\n        #lab = makemaze.genmaze(size=LABSIZE, nblines=4)\n        #count = np.zeros((LABSIZE, LABSIZE))\n\n        # # Select the reward location for this episode - not on a wall!\n        # rposr = 0; rposc = 0\n        # while lab[rposr, rposc] == 1:\n        #     rposr = np.random.randint(1, LABSIZE - 1)\n        #     rposc = np.random.randint(1, LABSIZE - 1)\n\n        # # We always start the episode from the center (when hitting reward, we may teleport either to center or to a random location depending on params['rsp'])\n        # posc = CTR\n        # posr = CTR\n\n\n\n        \n        optimizer.zero_grad()\n        loss = 0\n        lossv = 0\n        hidden = net.initialZeroState()\n        if params['type'] != 'rnn':\n            hebb = net.initialZeroHebb()\n        if params['type'] == 'modul':\n            et = net.initialZeroHebb() # Eligibility Trace is identical to Hebbian Trace in shape\n            pw = net.initialZeroPlasticWeights()\n        numactionchosen = 0\n\n\n        # Generate the cues. Make sure they're all different (important when using very small cues for debugging, e.g. cs=2, ni=2)\n        cuedata=[]\n        for nb in range(BS):\n            cuedata.append([])\n            for ncue in range(params['ni']):\n                assert len(cuedata[nb]) == ncue\n                foundsame = 1\n                cpt = 0\n                while foundsame > 0 :\n                    cpt += 1\n                    if cpt > 10000:\n                        # This should only occur with very weird parameters, e.g. 
cs=2, ni>4\n                        raise ValueError(\"Could not generate a full list of different cues\")\n                    foundsame = 0\n                    candidate = np.random.randint(2, size=params['cs']) * 2 - 1\n                    for backtrace in range(ncue):\n                        if np.array_equal(cuedata[nb][backtrace], candidate):\n                            foundsame = 1\n\n                cuedata[nb].append(candidate)\n\n\n        reward = np.zeros(BS)\n        sumreward = np.zeros(BS)\n        rewards = []\n        vs = []\n        logprobs = []\n        cues=[]\n        for nb in range(BS):\n            cues.append([])\n        dist = 0\n        numactionschosen = np.zeros(BS, dtype='int32')\n\n        #reward = 0.0\n        #rewards = []\n        #vs = []\n        #logprobs = []\n        #sumreward = 0.0\n        nbtrials = np.zeros(BS)\n        nbrewardabletrials = np.zeros(BS)\n        thistrialhascorrectcue = np.zeros(BS)\n        triallength = np.zeros(BS, dtype='int32')\n        correctcue = np.random.randint(params['ni'], size=BS)\n\n        trialstep = np.zeros(BS, dtype='int32')  \n\n        #print(\"EPISODE \", numepisode)\n        for numstep in range(params['eplen']):\n\n            #if params['clamp'] == 0:\n            inputs = np.zeros((BS, params['inputsize']), dtype='float32') \n            #else:\n            #    inputs = np.zeros((1, params['hs']), dtype='float32')\n\n            for nb in range(BS):\n            \n                if trialstep[nb] == 0:\n                    thistrialhascorrectcue[nb] = 0\n                    # Trial length is randomly modulated for each trial; first time step always -1 (i.e. no input cue), last time step always response-cue (i.e. 
NBINPUTBITS-1).\n                    #triallength = params['ni'] // 2  + 3 + np.random.randint(1 + params['ni'])  # 3 fixed-cue time steps (1st, last and next-to-last) + some random nb of no-cue time steps\n                    triallength[nb] = params['ni'] // 2  + 3 + np.random.randint(params['ni'])  # 3 fixed-cue time steps (1st, last and next-to-last) + some random nb of no-cue time steps\n                    \n                    \n                    \n                    # In any trial, we only show half the cues (randomly chosen), once each:\n                    mycues = [x for x in range(params['ni'])]\n                    random.shuffle(mycues); mycues = mycues[:len(mycues) // 2]\n                    # The rest is filled with no-input time steps (i.e. cue = -1), but also with the 3 fixed-cue steps (1st, last, next-to-last) \n                    for nc in range(triallength[nb] - 3  - len(mycues)):\n                        mycues.append(-1)\n                    random.shuffle(mycues)\n                    mycues.insert(0, -1); mycues.append(params['ni']); mycues.append(-1)  # The first and last time step have no input (cue -1), the next-to-last has the response cue.\n                    assert(len(mycues) == triallength[nb])\n                    cues[nb] = mycues\n\n            \n                inputs[nb, :NBINPUTBITS] = 0\n                if cues[nb][trialstep[nb]] > -1 and cues[nb][trialstep[nb]] < params['ni']:\n                    #inputs[0, cues[trialstep]] = 1.0\n                    inputs[nb, :NBINPUTBITS-1] = cuedata[nb][cues[nb][trialstep[nb]]][:]\n                    if cues[nb][trialstep[nb]] == correctcue[nb]:\n                        thistrialhascorrectcue[nb] = 1\n                if cues[nb][trialstep[nb]] == params['ni']:\n                    inputs[nb, NBINPUTBITS-1] = 1  # \"Go\" cue\n                    \n\n                inputs[nb, NBINPUTBITS + 0] = 1.0 # Bias neuron, probably not necessary\n                inputs[nb,NBINPUTBITS +  1] = 
numstep / params['eplen']\n                inputs[nb, NBINPUTBITS + 2] = 1.0 * reward[nb] # Reward from previous time step\n                if numstep > 0:\n                    inputs[nb, NBINPUTBITS + ADDINPUT + numactionschosen[nb]] = 1  # Previously chosen action\n\n            inputsC = torch.from_numpy(inputs).cuda()\n            # Might be better:\n            #if rposr == posr and rposc = posc:\n            #    inputs[0][-4] = 100.0\n            #else:\n            #    inputs[0][-4] = 0\n            \n            # Running the network\n\n            ## Running the network\n            if params['type'] == 'modplast':\n                y, v, DAout, hidden, hebb = net(Variable(inputsC, requires_grad=False), hidden, hebb)  # y  should output raw scores, not probas\n            elif params['type'] == 'modul':\n                y, v, DAout, hidden, hebb, et, pw  = net(Variable(inputsC, requires_grad=False), hidden, hebb, et, pw)  # y  should output raw scores, not probas\n            elif params['type'] == 'plastic':\n                y, v, hidden, hebb = net(Variable(inputsC, requires_grad=False), hidden, hebb)  # y  should output raw scores, not probas\n            elif params['type'] == 'rnn':\n                y, v, hidden = net(Variable(inputsC, requires_grad=False), hidden)  # y  should output raw scores, not probas\n            else:\n                raise ValueError(\"Network type unknown or not yet implemented!\")\n\n\n\n            y = F.softmax(y, dim=1)\n            # Must convert y to probas to use this !\n            distrib = torch.distributions.Categorical(y)\n            actionschosen = distrib.sample()  \n            logprobs.append(distrib.log_prob(actionschosen))\n            numactionschosen = actionschosen.data.cpu().numpy()    # Turn to scalar\n\n            if PRINTTRACE:\n                print(\"Step \", numstep, \" Inputs (1st in batch): \", inputs[0,:params['inputsize']], \" - Outputs(0): \", y.data.cpu().numpy()[0,:], \" - action 
chosen(0): \", numactionschosen[0],\n                        \"TrialLen(0):\", triallength[0], \"trialstep(0):\", trialstep[0], \"TTHCC(0): \", thistrialhascorrectcue[0], \" -Reward (previous step): \", reward[0], \", cues(0):\", cues[0], \", cc(0):\", correctcue[0])\n\n                #print(\"Step \", numstep, \" Inputs: \", inputs[0,:params['inputsize']], \" - Outputs: \", y.data.cpu().numpy(), \" - action chosen: \", numactionchosen,\n                #        \" - mean abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())), \"TrialLen:\", triallength, \"trialstep:\", trialstep, \"TTHCC: \", thistrialhascorrectcue, \" -Reward (previous step): \", reward, \", cues:\", cues, \", cc:\", correctcue)\n\n            reward = np.zeros(BS, dtype='float32')\n\n            for nb in range(BS):\n                if numactionschosen[nb] == 1:\n                    # Small penalty for any non-rest action taken\n                    reward[nb]  -= params['wp']\n            \n            \n            ### DEBUGGING\n            ## Easiest possible episode-dependent response (i.e. the easiest\n            ## possible problem that actually require meta-learning, with ni=2)\n            ## This one works pretty wel... But harder ones don't work well!\n            #if numactionchosen == correctcue :\n            #        reward = params['rew']\n            #else:\n            #        reward = -params['rew']\n\n\n                trialstep[nb] += 1\n                if trialstep[nb] == triallength[nb] - 1:\n                    # This was the next-to-last step of the trial (and we showed the response signal, unless it was the first few steps in episode). 
\n                    assert(cues[nb][trialstep[nb] - 1] == params['ni'] or numstep < 2)\n                    # We must deliver reward (which will be perceived by the agent at the next step), positive or negative, depending on response\n                    if thistrialhascorrectcue[nb] and numactionschosen[nb] == 1:\n                        reward[nb] += params['rew']\n                    elif (not thistrialhascorrectcue[nb]) and numactionschosen[nb] == 0:\n                        reward[nb] += params['rew']\n                    else:\n                        reward[nb] -= params['rew']\n\n                    if np.random.rand() < params['pf']:\n                        reward[nb] = -reward[nb]\n                \n                if trialstep[nb] == triallength[nb]:\n                    # This was the last step of the trial (and we showed no input)\n                    assert(cues[nb][trialstep[nb] - 1] == -1 or numstep < 2)\n                    nbtrials[nb] += 1\n                    totalnbtrials += 1\n                    if thistrialhascorrectcue[nb]:\n                        nbtrialswithcc += 1\n                        #nbrewardabletrials += 1 \n                    # Trial is dead, long live trial\n                    trialstep[nb] = 0\n\n                    # We initialize the hidden state between trials!\n                    #if params['is'] == 1:\n                    #    hidden = net.initialZeroState()\n\n\n\n            rewards.append(reward)\n            vs.append(v)\n            sumreward += reward\n\n\n\n            #if params['alg'] in ['A3C' , 'REIE' , 'REIT']:\n            \n            loss += (params['bent'] * y.pow(2).sum() / BS )   # We want to penalize concentration, i.e. 
encourage diversity; our version of PyTorch does not have an entropy() function for Distribution, so we use this instead.\n\n            \n\n            ##if PRINTTRACE:\n            ##    print(\"Probabilities:\", y.data.cpu().numpy(), \"Picked action:\", numactionchosen, \", got reward\", reward)\n        \n        R = Variable(torch.zeros(BS).cuda(), requires_grad=False)\n        gammaR = params['gr']\n        for numstepb in reversed(range(params['eplen'])) :\n            R = gammaR * R + Variable(torch.from_numpy(rewards[numstepb]).cuda(), requires_grad=False)\n            ctrR = R - vs[numstepb][0]\n            lossv += ctrR.pow(2).sum() / BS\n            loss -= (logprobs[numstepb] * ctrR.detach()).sum() / BS  # Need to check if detach() is OK\n            #pdb.set_trace()\n\n\n        # Episode is done, now let's do the actual computations\n        #gammaR = params['gr']\n        #if params['alg'] == 'A3C':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        lossv += (vs[numstepb][0] - R).pow(2)\n        #        loss -= logprobs[numstepb] * (R - vs[numstepb].data[0][0])  # Not sure if the \"data\" is needed... 
put it b/c of worry about weird gradient flows\n        #    loss += params['bv'] * lossv\n\n        #elif params['alg'] in ['REI', 'REIE']:\n        #    R = sumreward\n        #    baseline = meanreward[correctcue]\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        loss -= logprobs[numstepb] * (R - baseline)\n        #elif params['alg'] == 'REIT':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        loss -= logprobs[numstepb] * (R - meanrewardT[correctcue, numstepb])\n        #else:\n        #    raise ValueError(\"Must select algo type\")\n        #elif params['alg'] == 'REINOB':\n        #    R = sumreward\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        loss -= logprobs[numstepb] * R\n        #elif params['alg'] == 'REITMP':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        loss -= logprobs[numstepb] * R\n\n        #else:\n        #    raise ValueError(\"Which algo?\")\n\n        #meanreward[correctcue] = (1.0 - params['nu']) * meanreward[correctcue] + params['nu'] * sumreward\n        ##meanreward[rposr, rposc] = (1.0 - params['nu']) * meanreward[rposr, rposc] + params['nu'] * sumreward\n        #R = 0\n        #for numstepb in reversed(range(params['eplen'])) :\n        #    R = gammaR * R + rewards[numstepb]\n        #    meanrewardT[correctcue, numstepb] = (1.0 - params['nu']) * meanrewardT[correctcue, numstepb] + params['nu'] * R\n\n        loss += params['blossv'] * lossv\n        loss /= params['eplen']\n\n        if PRINTTRACE:\n            #if params['alg'] == 'A3C':\n            print(\"lossv: \", float(lossv))\n            #elif params['alg'] in ['REI', 'REIE', 'REIT']:\n            #    print(\"meanreward baselines: \", [meanreward[x] for x in range(params['ni'])])\n            
print (\"Total reward for this episode(0):\", sumreward[0], \"Prop. of trials w/ rewarded cue:\", (nbtrialswithcc / totalnbtrials))\n            print(\"Nb trials for this episode(0):\", nbtrials[0], \" Total Nb of trials:\", totalnbtrials)\n\n        #if params['squash'] == 1:\n        #    if sumreward < 0:\n        #        sumreward = -np.sqrt(-sumreward)\n        #    else:\n        #        sumreward = np.sqrt(sumreward)\n        #elif params['squash'] == 0:\n        #    pass\n        #else:\n        #    raise ValueError(\"Incorrect value for squash parameter\")\n\n        #loss *= sumreward\n\n        #for p in net.parameters():\n        #    p.grad.data.clamp_(-params['clamp'], params['clamp'])\n        loss.backward()\n        all_grad_norms.append(torch.nn.utils.clip_grad_norm(net.parameters(), params['gc']))\n        if numepisode > 100:  # Burn-in period for meanreward\n            optimizer.step()\n\n\n        #print(sumreward)\n        lossnum = float(loss)\n        lossbetweensaves += lossnum\n        all_losses_objective.append(lossnum)\n        all_total_rewards.append(sumreward.mean())\n        #all_total_rewards.append(sumreward[0])\n            #all_losses_v.append(lossv.data[0])\n        #total_loss  += lossnum\n\n\n        if (numepisode+1) % params['pe'] == 0:\n\n            print(numepisode, \"====\")\n            print(\"Mean loss: \", lossbetweensaves / params['pe'])\n            lossbetweensaves = 0\n            print(\"Mean reward: \", np.sum(all_total_rewards[-params['pe']:])/ params['pe'])\n            previoustime = nowtime\n            nowtime = time.time()\n            print(\"Time spent on last\", params['pe'], \"iters: \", nowtime - previoustime)\n            if params['type'] == 'plastic' or params['type'] == 'lstmplastic':\n                print(\"ETA: \", float(net.eta), \"alpha[0,1]: \", net.alpha.data.cpu().numpy()[0,1], \"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n            elif params['type'] == 
'modul' or params['type'] == 'modul2':\n                print(\"ETA: \", net.eta.data.cpu().numpy(), \" etaet: \", net.etaet.data.cpu().numpy(), \" mean-abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())))\n            elif params['type'] == 'rnn':\n                print(\"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n\n        if (numepisode+1) % params['save_every'] == 0:\n            print(\"Saving files...\")\n#            lossbetweensaves /= params['save_every']\n#            print(\"Average loss over the last\", params['save_every'], \"episodes:\", lossbetweensaves)\n#            print(\"Alternative computation (should be equal):\", np.mean(all_losses_objective[-params['save_every']:]))\n            losslast100 = np.mean(all_losses_objective[-100:])\n            print(\"Average loss over the last 100 episodes:\", losslast100)\n#            # Instability detection; necessary for SELUs, which seem to be divergence-prone\n#            # Note that if we are unlucky enough to have diverged within the last 100 timesteps, this may not save us.\n#            if losslast100 > 2 * lossbetweensavesprev:\n#                print(\"We have diverged ! 
Restoring last savepoint!\")\n#                net.load_state_dict(torch.load('./torchmodel_'+suffix + '.txt'))\n#            else:\n            print(\"Saving local files...\")\n            #with open('params_'+suffix+'.dat', 'wb') as fo:\n            #        #pickle.dump(net.w.data.cpu().numpy(), fo)\n            #        #pickle.dump(net.alpha.data.cpu().numpy(), fo)\n            #        #pickle.dump(net.eta.data.cpu().numpy(), fo)\n            #        #pickle.dump(all_losses, fo)\n            #        pickle.dump(params, fo)\n            #with open('loss_'+suffix+'.txt', 'w') as thefile:\n            #    for item in all_losses_objective:\n            #            thefile.write(\"%s\\n\" % item)\n            #with open('lossv_'+suffix+'.txt', 'w') as thefile:\n            #    for item in all_losses_v:\n            #            thefile.write(\"%s\\n\" % item)\n            #with open('grads_'+suffix+'.txt', 'w') as thefile:\n            #    for item in all_grad_norms[::10]:\n            #            thefile.write(\"%s\\n\" % item)\n            with open('loss_'+suffix+'.txt', 'w') as thefile:\n                for item in all_total_rewards[::10]:\n                        thefile.write(\"%s\\n\" % item)\n            torch.save(net.state_dict(), 'torchmodel_'+suffix+'.dat')\n            with open('params_'+suffix+'.dat', 'wb') as fo:\n                pickle.dump(params, fo)\n            print(\"Saving NFS files...\")\n            if os.path.isdir('/mnt/share/tmiconi'):\n                print(\"Transferring to NFS storage...\")\n                for fn in ['params_'+suffix+'.dat', 'loss_'+suffix+'.txt', 'torchmodel_'+suffix+'.dat']:\n                    result = os.system(\n                        'cp {} {}'.format(fn, '/mnt/share/tmiconi/3level/'+fn))\n                print(\"Done!\")\n#            lossbetweensavesprev = lossbetweensaves\n#            lossbetweensaves = 0\n#            sys.stdout.flush()\n#            sys.stderr.flush()\n\n\n\nif __name__ == 
\"__main__\":\n#defaultParams = {\n#    'type' : 'lstm',\n#    'seqlen' : 200,\n#    'hs': 500,\n#    'activ': 'tanh',\n#    'steplr': 10e9,  # By default, no change in the learning rate\n#    'gamma': .5,  # The annealing factor of learning rate decay for Adam\n#    'imagesize': 31,\n#    'nbiter': 30000,\n#    'lr': 1e-4,\n#    'test_every': 10,\n#    'save_every': 3000,\n#    'rngseed':0\n#}\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--rngseed\", type=int, help=\"random seed\", default=0)\n    #parser.add_argument(\"--clamp\", type=float, help=\"maximum (absolute value) gradient for clamping\", default=1000000.0)\n    #parser.add_argument(\"--wp\", type=float, help=\"wall penalty (reward decrement for hitting a wall)\", default=0.1)\n    parser.add_argument(\"--rew\", type=float, help=\"reward value (reward increment for taking correct action after correct stimulus)\", default=1.0)\n    parser.add_argument(\"--wp\", type=float, help=\"penalty for hitting walls\", default=.0)\n    #parser.add_argument(\"--pen\", type=float, help=\"penalty value (reward decrement for taking any non-rest action)\", default=.2)\n    #parser.add_argument(\"--exprew\", type=float, help=\"reward value (reward increment for hitting reward location)\", default=.0)\n    parser.add_argument(\"--bent\", type=float, help=\"coefficient for the entropy reward (really Simpson index concentration measure)\", default=0.03)\n    parser.add_argument(\"--blossv\", type=float, help=\"coefficient for value prediction loss\", default=.1)\n    #parser.add_argument(\"--probarev\", type=float, help=\"probability of reversal (random change) in desired stimulus-response, per time step\", default=0.0)\n    parser.add_argument(\"--bv\", type=float, help=\"coefficient for value prediction loss\", default=.1)\n    #parser.add_argument(\"--lsize\", type=int, help=\"size of the labyrinth; must be odd\", default=7)\n    #parser.add_argument(\"--randstart\", type=int, help=\"when hitting 
reward, should we teleport to random location (1) or center (0)?\", default=0)\n    #parser.add_argument(\"--rp\", type=int, help=\"whether the reward should be on the periphery\", default=0)\n    #parser.add_argument(\"--squash\", type=int, help=\"squash reward through signed sqrt (1 or 0)\", default=0)\n    #parser.add_argument(\"--nbarms\", type=int, help=\"number of arms\", default=2)\n    #parser.add_argument(\"--nbseq\", type=int, help=\"number of sequences between reinitializations of hidden/Hebbian state and position\", default=3)\n    #parser.add_argument(\"--activ\", help=\"activ function ('tanh' or 'selu')\", default='tanh')\n    parser.add_argument(\"--alg\", help=\"meta-learning algorithm (A3C or REI or REIE or REIT)\", default='REIT')\n    parser.add_argument(\"--rule\", help=\"learning rule ('hebb' or 'oja')\", default='hebb')\n    parser.add_argument(\"--type\", help=\"network type ('lstm' or 'rnn' or 'plastic')\", default='modul')\n    #parser.add_argument(\"--msize\", type=int, help=\"size of the maze; must be odd\", default=9)\n    parser.add_argument(\"--da\", help=\"transformation function of DA signal (tanh or sig or lin)\", default='tanh')\n    parser.add_argument(\"--gr\", type=float, help=\"gammaR: discounting factor for rewards\", default=.9)\n    parser.add_argument(\"--lr\", type=float, help=\"learning rate (Adam optimizer)\", default=1e-4)\n    parser.add_argument(\"--fm\", type=int, help=\"if using neuromodulation, do we modulate the whole network (1) or just half (0) ?\", default=1)\n    #parser.add_argument(\"--na\", type=int, help=\"number of actions (excluding \\\"rest\\\" action)\", default=2)\n    parser.add_argument(\"--ni\", type=int, help=\"number of different inputs\", default=2)\n    parser.add_argument(\"--nu\", type=float, help=\"REINFORCE baseline time constant\", default=.1)\n    #parser.add_argument(\"--samestep\", type=int, help=\"compare stimulus and response in the same step (1) or from successive steps (0) ?\", 
default=0)\n    #parser.add_argument(\"--nbin\", type=int, help=\"number of possible input stimuli\", default=4)\n    #parser.add_argument(\"--modhalf\", type=int, help=\"which half of the recurrent network receives modulation (1 or 2)\", default=1)\n    #parser.add_argument(\"--nbac\", type=int, help=\"number of possible non-rest actions\", default=4)\n    #parser.add_argument(\"--rsp\", type=int, help=\"does the agent start each episode from random position (1) or center (0) ?\", default=1)\n    parser.add_argument(\"--addpw\", type=int, help=\"are plastic weights purely additive (1) or forgetting (0) ?\", default=2)\n    parser.add_argument(\"--clamp\", type=int, help=\"inputs clamped (1), fully clamped (2) or through linear layer (0) ?\", default=0)\n    parser.add_argument(\"--eplen\", type=int, help=\"length of episodes\", default=100)\n    #parser.add_argument(\"--exptime\", type=int, help=\"exploration (no reward) time (must be < eplen)\", default=0)\n    parser.add_argument(\"--hs\", type=int, help=\"size of the recurrent (hidden) layer\", default=100)\n    parser.add_argument(\"--is\", type=int, help=\"do we initialize hidden state after each trial (1) or not (0) ?\", default=0)\n    parser.add_argument(\"--cs\", type=int, help=\"cue size - number of bits for each cue\", default=10)\n    parser.add_argument(\"--pf\", type=float, help=\"probability of flipping the reward (.5 = pure noise)\", default=0)\n    parser.add_argument(\"--l2\", type=float, help=\"coefficient of L2 norm (weight decay)\", default=1e-5)\n    parser.add_argument(\"--bs\", type=int, help=\"batch size\", default=1)\n    parser.add_argument(\"--gc\", type=float, help=\"gradient clipping\", default=1000.0)\n    parser.add_argument(\"--eps\", type=float, help=\"epsilon for Adam optimizer\", default=1e-6)\n    #parser.add_argument(\"--steplr\", type=int, help=\"duration of each step in the learning rate annealing schedule\", default=100000000)\n    #parser.add_argument(\"--gamma\", 
type=float, help=\"learning rate annealing factor\", default=0.3)\n    parser.add_argument(\"--nbiter\", type=int, help=\"number of learning cycles\", default=1000000)\n    parser.add_argument(\"--save_every\", type=int, help=\"number of cycles between successive save points\", default=200)\n    parser.add_argument(\"--pe\", type=int, help=\"'print every', number of cycles between successive printing of information\", default=100)\n    #parser.add_argument(\"--\", type=int, help=\"\", default=1e-4)\n    args = parser.parse_args(); argvars = vars(args); argdict =  { k : argvars[k] for k in argvars if argvars[k] is not None }\n    #train()\n    train(argdict)\n\n"
  },
  {
    "path": "sr/srrun.py",
    "content": "import argparse\nimport pdb\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nfrom torch import optim\nfrom torch.optim import lr_scheduler\nimport random\nimport sys\nimport pickle\nimport time\nimport os\nimport platform\n#import makemaze\n\nimport numpy as np\n#import matplotlib.pyplot as plt\nimport glob\n\nimport modul\n\n\n\n\nnp.set_printoptions(precision=4)\n\n\n\nADDINPUT = 4 # 1 inputs for the previous reward, 1 inputs for numstep, 1 unused,  1 \"Bias\" inputs\n\n\ndef train(paramdict):\n    #params = dict(click.get_current_context().params)\n\n    #params['inputsize'] =  RFSIZE * RFSIZE + ADDINPUT + NBNONRESTACTIONS\n\n    suffix = 'SRB_addpw_2_alg_A3C_bent_0.1_blossv_0.1_bs_30_bv_0.1_clamp_0_cs_20_da_tanh_eplen_120_eps_1e-06_fm_1_gc_2.0_gr_0.9_hs_200_is_0_l2_0.0_lr_0.0001_nbiter_200000_ni_4_nu_0.1_pf_0.0_rew_1.0_rule_hebb_type_modul_wp_0.0_rngseed_11'\n    \n    print(\"Starting training...\")\n    params = {}\n    #params.update(defaultParams)\n    params.update(paramdict)\n\n    with open('./params_'+suffix+'.dat', 'rb') as fo:\n        params = pickle.load(fo)\n\n\n    params['bs'] = 1\n\n\n    print(\"Used params: \", params)\n    print(platform.uname())\n    #params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['interpresdelay']) * params['nbclasses']) + params['prestimetest']  # Total number of steps per episode\n    #NBINPUTBITS = params['ni'] + 1 \n    NBINPUTBITS = params['cs'] + 1 # The additional bit is for the response cue (i.e. 
the \"Go\" cue)\n    params['outputsize'] =  2  # \"response\" and \"no response\"\n    params['inputsize'] = NBINPUTBITS +  params['outputsize'] + ADDINPUT  # The total number of input bits is the size of inputs, plus the \"response cue\" input, plus the number of actions, plus the number of additional inputs\n\n    # This doesn't work with our version of PyTorch\n    #params['device'] = 'gpu'\n    #device = torch.device(\"cuda:0\" if self.params['device'] == 'gpu' else \"cpu\")\n    BS = params['bs']\n\n    # Initialize random seeds (first two redundant?)\n    print(\"Setting random seeds\")\n    np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n    #print(click.get_current_context().params)\n\n    print(\"Initializing network\")\n    if params['type'] == 'modul':\n        net = modul.RetroModulRNN(params)\n    elif params['type'] == 'modplast':\n        net = modul.SimpleModulRNN(params)\n    elif params['type'] == 'plastic':\n        net = modul.PlasticRNN(params)\n    elif params['type'] == 'rnn':\n        net = modul.NonPlasticRNN(params)\n    else:\n        raise ValueError(\"Network type unknown or not yet implemented: \"+params['type'])\n\n    print (\"Shape of all optimized parameters:\", [x.size() for x in net.parameters()])\n    allsizes = [torch.numel(x.data.cpu()) for x in net.parameters()]\n    print (\"Size (numel) of all optimized elements:\", allsizes)\n    print (\"Total size (numel) of all optimized elements:\", sum(allsizes))\n\n    #total_loss = 0.0\n    print(\"Initializing optimizer\")\n   \n    #optimizer = torch.optim.Adam(net.parameters(), lr=1.0*params['lr'], eps=params['eps'], weight_decay=params['l2'])\n\n\n    all_losses = []\n    all_grad_norms = []\n    all_losses_objective = []\n    all_total_rewards = []\n    all_losses_v = []\n    lossbetweensaves = 0\n    nowtime = time.time()\n    #meanreward = np.zeros((LABSIZE, LABSIZE))\n    meanreward = np.zeros(params['ni'])\n    
meanrewardT = np.zeros((params['ni'], params['eplen']))\n\n    nbtrials = [0]*BS\n    totalnbtrials = 0\n    nbtrialswithcc = 0\n\n\n    print(\"Starting episodes!\")\n\n    for numepisode in range(params['nbiter']):\n\n        PRINTTRACE = 0\n        #if (numepisode+1) % (1 + params['pe']) == 0:\n        if (numepisode+1) % (params['pe']) == 0:\n            PRINTTRACE = 1\n\n        #lab = makemaze.genmaze(size=LABSIZE, nblines=4)\n        #count = np.zeros((LABSIZE, LABSIZE))\n\n        # # Select the reward location for this episode - not on a wall!\n        # rposr = 0; rposc = 0\n        # while lab[rposr, rposc] == 1:\n        #     rposr = np.random.randint(1, LABSIZE - 1)\n        #     rposc = np.random.randint(1, LABSIZE - 1)\n\n        # # We always start the episode from the center (when hitting reward, we may teleport either to center or to a random location depending on params['rsp'])\n        # posc = CTR\n        # posr = CTR\n\n\n\n        \n        # The optimizer creation is commented out above; training is disabled in this run-only script\n        #optimizer.zero_grad()\n        loss = 0\n        lossv = 0\n        hidden = net.initialZeroState()\n        if params['type'] != 'rnn':\n            hebb = net.initialZeroHebb()\n        if params['type'] == 'modul':\n            et = net.initialZeroHebb() # Eligibility Trace is identical to Hebbian Trace in shape\n            pw = net.initialZeroPlasticWeights()\n        numactionchosen = 0\n\n\n        # Generate the cues. Make sure they're all different (important when using very small cues for debugging, e.g. cs=2, ni=2)\n        cuedata=[]\n        for nb in range(BS):\n            cuedata.append([])\n            for ncue in range(params['ni']):\n                assert len(cuedata[nb]) == ncue\n                foundsame = 1\n                cpt = 0\n                while foundsame > 0 :\n                    cpt += 1\n                    if cpt > 10000:\n                        # This should only occur with very weird parameters, e.g. 
cs=2, ni>4\n                        raise ValueError(\"Could not generate a full list of different cues\")\n                    foundsame = 0\n                    candidate = np.random.randint(2, size=params['cs']) * 2 - 1\n                    for backtrace in range(ncue):\n                        if np.array_equal(cuedata[nb][backtrace], candidate):\n                            foundsame = 1\n\n                cuedata[nb].append(candidate)\n\n\n        reward = np.zeros(BS)\n        sumreward = np.zeros(BS)\n        rewards = []\n        vs = []\n        logprobs = []\n        cues=[]\n        for nb in range(BS):\n            cues.append([])\n        dist = 0\n        numactionschosen = np.zeros(BS, dtype='int32')\n\n        #reward = 0.0\n        #rewards = []\n        #vs = []\n        #logprobs = []\n        #sumreward = 0.0\n        nbtrials = np.zeros(BS)\n        nbrewardabletrials = np.zeros(BS)\n        thistrialhascorrectcue = np.zeros(BS)\n        triallength = np.zeros(BS, dtype='int32')\n        correctcue = np.random.randint(params['ni'], size=BS)\n\n        trialstep = np.zeros(BS, dtype='int32')  \n\n        #print(\"EPISODE \", numepisode)\n        for numstep in range(params['eplen']):\n\n            #if params['clamp'] == 0:\n            inputs = np.zeros((BS, params['inputsize']), dtype='float32') \n            #else:\n            #    inputs = np.zeros((1, params['hs']), dtype='float32')\n\n            for nb in range(BS):\n            \n                if trialstep[nb] == 0:\n                    thistrialhascorrectcue[nb] = 0\n                    # Trial length is randomly modulated for each trial; first time step always -1 (i.e. no input cue), last time step always response-cue (i.e. 
NBINPUTBITS-1).\n                    #triallength = params['ni'] // 2  + 3 + np.random.randint(1 + params['ni'])  # 3 fixed-cue time steps (1st, last and next-to-last) + some random nb of no-cue time steps\n                    triallength[nb] = params['ni'] // 2  + 3 + np.random.randint(params['ni'])  # 3 fixed-cue time steps (1st, last and next-to-last) + some random nb of no-cue time steps\n                    \n                    \n                    \n                    # In any trial, we only show half the cues (randomly chosen), once each:\n                    mycues = [x for x in range(params['ni'])]\n                    random.shuffle(mycues); mycues = mycues[:len(mycues) // 2]\n                    # The rest is filled with no-input time steps (i.e. cue = -1), but also with the 3 fixed-cue steps (1st, last, next-to-last) \n                    for nc in range(triallength[nb] - 3  - len(mycues)):\n                        mycues.append(-1)\n                    random.shuffle(mycues)\n                    mycues.insert(0, -1); mycues.append(params['ni']); mycues.append(-1)  # The first and last time step have no input (cue -1), the next-to-last has the response cue.\n                    assert(len(mycues) == triallength[nb])\n                    cues[nb] = mycues\n\n            \n                inputs[nb, :NBINPUTBITS] = 0\n                if cues[nb][trialstep[nb]] > -1 and cues[nb][trialstep[nb]] < params['ni']:\n                    #inputs[0, cues[trialstep]] = 1.0\n                    inputs[nb, :NBINPUTBITS-1] = cuedata[nb][cues[nb][trialstep[nb]]][:]\n                    if cues[nb][trialstep[nb]] == correctcue[nb]:\n                        thistrialhascorrectcue[nb] = 1\n                if cues[nb][trialstep[nb]] == params['ni']:\n                    inputs[nb, NBINPUTBITS-1] = 1  # \"Go\" cue\n                    \n\n                inputs[nb, NBINPUTBITS + 0] = 1.0 # Bias neuron, probably not necessary\n                inputs[nb,NBINPUTBITS +  1] = 
numstep / params['eplen']\n                inputs[nb, NBINPUTBITS + 2] = 1.0 * reward[nb] # Reward from previous time step\n                if numstep > 0:\n                    inputs[nb, NBINPUTBITS + ADDINPUT + numactionschosen[nb]] = 1  # Previously chosen action\n\n            inputsC = torch.from_numpy(inputs).cuda()\n            # Might be better:\n            #if rposr == posr and rposc = posc:\n            #    inputs[0][-4] = 100.0\n            #else:\n            #    inputs[0][-4] = 0\n            \n            # Running the network\n\n            ## Running the network\n            if params['type'] == 'modplast':\n                y, v, DAout, hidden, hebb = net(Variable(inputsC, requires_grad=False), hidden, hebb)  # y  should output raw scores, not probas\n            elif params['type'] == 'modul':\n                y, v, DAout, hidden, hebb, et, pw  = net(Variable(inputsC, requires_grad=False), hidden, hebb, et, pw)  # y  should output raw scores, not probas\n            elif params['type'] == 'plastic':\n                y, v, hidden, hebb = net(Variable(inputsC, requires_grad=False), hidden, hebb)  # y  should output raw scores, not probas\n            elif params['type'] == 'rnn':\n                y, v, hidden = net(Variable(inputsC, requires_grad=False), hidden)  # y  should output raw scores, not probas\n            else:\n                raise ValueError(\"Network type unknown or not yet implemented!\")\n\n\n\n            y = F.softmax(y, dim=1)\n            # Must convert y to probas to use this !\n            distrib = torch.distributions.Categorical(y)\n            actionschosen = distrib.sample()  \n            logprobs.append(distrib.log_prob(actionschosen))\n            numactionschosen = actionschosen.data.cpu().numpy()    # Turn to scalar\n\n            if PRINTTRACE:\n                print(\"Step \", numstep, \" Inputs (1st in batch): \", inputs[0,:params['inputsize']], \" - Outputs(0): \", y.data.cpu().numpy()[0,:], \" - action 
chosen(0): \", numactionschosen[0],\n                        \"TrialLen(0):\", triallength[0], \"trialstep(0):\", trialstep[0], \"TTHCC(0): \", thistrialhascorrectcue[0], \" -Reward (previous step): \", reward[0], \", cues(0):\", cues[0], \", cc(0):\", correctcue[0])\n\n                #print(\"Step \", numstep, \" Inputs: \", inputs[0,:params['inputsize']], \" - Outputs: \", y.data.cpu().numpy(), \" - action chosen: \", numactionchosen,\n                #        \" - mean abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())), \"TrialLen:\", triallength, \"trialstep:\", trialstep, \"TTHCC: \", thistrialhascorrectcue, \" -Reward (previous step): \", reward, \", cues:\", cues, \", cc:\", correctcue)\n\n            reward = np.zeros(BS, dtype='float32')\n\n            for nb in range(BS):\n                if numactionschosen[nb] == 1:\n                    # Small penalty for any non-rest action taken\n                    reward[nb]  -= params['wp']\n            \n            \n            ### DEBUGGING\n            ## Easiest possible episode-dependent response (i.e. the easiest\n            ## possible problem that actually requires meta-learning, with ni=2)\n            ## This one works pretty well... But harder ones don't work well!\n            #if numactionchosen == correctcue :\n            #        reward = params['rew']\n            #else:\n            #        reward = -params['rew']\n\n\n                trialstep[nb] += 1\n                if trialstep[nb] == triallength[nb] - 1:\n                    # This was the next-to-last step of the trial (and we showed the response signal, unless it was the first few steps in episode). 
\n                    assert(cues[nb][trialstep[nb] - 1] == params['ni'] or numstep < 2)\n                    # We must deliver reward (which will be perceived by the agent at the next step), positive or negative, depending on response\n                    if thistrialhascorrectcue[nb] and numactionschosen[nb] == 1:\n                        reward[nb] += params['rew']\n                    elif (not thistrialhascorrectcue[nb]) and numactionschosen[nb] == 0:\n                        reward[nb] += params['rew']\n                    else:\n                        reward[nb] -= params['rew']\n\n                    if np.random.rand() < params['pf']:\n                        reward[nb] = -reward[nb]\n                \n                if trialstep[nb] == triallength[nb]:\n                    # This was the last step of the trial (and we showed no input)\n                    assert(cues[nb][trialstep[nb] - 1] == -1 or numstep < 2)\n                    nbtrials[nb] += 1\n                    totalnbtrials += 1\n                    if thistrialhascorrectcue[nb]:\n                        nbtrialswithcc += 1\n                        #nbrewardabletrials += 1 \n                    # Trial is dead, long live trial\n                    trialstep[nb] = 0\n\n                    # We initialize the hidden state between trials!\n                    #if params['is'] == 1:\n                    #    hidden = net.initialZeroState()\n\n\n\n            rewards.append(reward)\n            vs.append(v)\n            sumreward += reward\n\n\n\n            #if params['alg'] in ['A3C' , 'REIE' , 'REIT']:\n            \n            loss += (params['bent'] * y.pow(2).sum() / BS )   # We want to penalize concentration, i.e. 
encourage diversity; our version of PyTorch does not have an entropy() function for Distribution, so we use this instead.\n\n            \n\n            ##if PRINTTRACE:\n            ##    print(\"Probabilities:\", y.data.cpu().numpy(), \"Picked action:\", numactionchosen, \", got reward\", reward)\n        \n        R = Variable(torch.zeros(BS).cuda(), requires_grad=False)\n        gammaR = params['gr']\n        for numstepb in reversed(range(params['eplen'])) :\n            R = gammaR * R + Variable(torch.from_numpy(rewards[numstepb]).cuda(), requires_grad=False)\n            ctrR = R - vs[numstepb][0]\n            lossv += ctrR.pow(2).sum() / BS\n            loss -= (logprobs[numstepb] * ctrR.detach()).sum() / BS  # Need to check if detach() is OK\n            #pdb.set_trace()\n\n\n        # Episode is done, now let's do the actual computations\n        #gammaR = params['gr']\n        #if params['alg'] == 'A3C':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        lossv += (vs[numstepb][0] - R).pow(2)\n        #        loss -= logprobs[numstepb] * (R - vs[numstepb].data[0][0])  # Not sure if the \"data\" is needed... 
put it b/c of worry about weird gradient flows\n        #    loss += params['bv'] * lossv\n\n        #elif params['alg'] in ['REI', 'REIE']:\n        #    R = sumreward\n        #    baseline = meanreward[correctcue]\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        loss -= logprobs[numstepb] * (R - baseline)\n        #elif params['alg'] == 'REIT':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        loss -= logprobs[numstepb] * (R - meanrewardT[correctcue, numstepb])\n        #else:\n        #    raise ValueError(\"Must select algo type\")\n        #elif params['alg'] == 'REINOB':\n        #    R = sumreward\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        loss -= logprobs[numstepb] * R\n        #elif params['alg'] == 'REITMP':\n        #    R = 0\n        #    for numstepb in reversed(range(params['eplen'])) :\n        #        R = gammaR * R + rewards[numstepb]\n        #        loss -= logprobs[numstepb] * R\n\n        #else:\n        #    raise ValueError(\"Which algo?\")\n\n        #meanreward[correctcue] = (1.0 - params['nu']) * meanreward[correctcue] + params['nu'] * sumreward\n        ##meanreward[rposr, rposc] = (1.0 - params['nu']) * meanreward[rposr, rposc] + params['nu'] * sumreward\n        #R = 0\n        #for numstepb in reversed(range(params['eplen'])) :\n        #    R = gammaR * R + rewards[numstepb]\n        #    meanrewardT[correctcue, numstepb] = (1.0 - params['nu']) * meanrewardT[correctcue, numstepb] + params['nu'] * R\n\n        loss += params['blossv'] * lossv\n        loss /= params['eplen']\n\n        if PRINTTRACE:\n            #if params['alg'] == 'A3C':\n            print(\"lossv: \", float(lossv))\n            #elif params['alg'] in ['REI', 'REIE', 'REIT']:\n            #    print(\"meanreward baselines: \", [meanreward[x] for x in range(params['ni'])])\n            
print (\"Total reward for this episode(0):\", sumreward[0], \"Prop. of trials w/ rewarded cue:\", (nbtrialswithcc / totalnbtrials))\n            print(\"Nb trials for this episode(0):\", nbtrials[0], \" Total Nb of trials:\", totalnbtrials)  # bs is forced to 1 above, so only index 0 is valid\n\n        #if params['squash'] == 1:\n        #    if sumreward < 0:\n        #        sumreward = -np.sqrt(-sumreward)\n        #    else:\n        #        sumreward = np.sqrt(sumreward)\n        #elif params['squash'] == 0:\n        #    pass\n        #else:\n        #    raise ValueError(\"Incorrect value for squash parameter\")\n\n        #loss *= sumreward\n\n        #loss.backward()\n        #all_grad_norms.append(torch.nn.utils.clip_grad_norm(net.parameters(), params['gc']))\n        #if numepisode > 100:  # Burn-in period for meanreward\n        #    optimizer.step()\n\n\n        #print(sumreward)\n        lossnum = float(loss)\n        lossbetweensaves += lossnum\n        all_losses_objective.append(lossnum)\n        all_total_rewards.append(sumreward.mean())\n        #all_total_rewards.append(sumreward[0])\n            #all_losses_v.append(lossv.data[0])\n        #total_loss  += lossnum\n\n\n        if (numepisode+1) % params['pe'] == 0:\n\n            print(numepisode, \"====\")\n            print(\"Mean loss: \", lossbetweensaves / params['pe'])\n            lossbetweensaves = 0\n            print(\"Mean reward: \", np.sum(all_total_rewards[-params['pe']:])/ params['pe'])\n            previoustime = nowtime\n            nowtime = time.time()\n            print(\"Time spent on last\", params['pe'], \"iters: \", nowtime - previoustime)\n            if params['type'] == 'plastic' or params['type'] == 'lstmplastic':\n                print(\"ETA: \", float(net.eta), \"alpha[0,1]: \", net.alpha.data.cpu().numpy()[0,1], \"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n            elif params['type'] == 'modul' or params['type'] == 'modul2':\n                print(\"ETA: \", net.eta.data.cpu().numpy(), 
\" etaet: \", net.etaet.data.cpu().numpy(), \" mean-abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())))\n            elif params['type'] == 'rnn':\n                print(\"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n\n        if (numepisode+1) % params['save_every'] == 0:\n            print(\"Saving files...\")\n#            lossbetweensaves /= params['save_every']\n#            print(\"Average loss over the last\", params['save_every'], \"episodes:\", lossbetweensaves)\n#            print(\"Alternative computation (should be equal):\", np.mean(all_losses_objective[-params['save_every']:]))\n            losslast100 = np.mean(all_losses_objective[-100:])\n            print(\"Average loss over the last 100 episodes:\", losslast100)\n#            # Instability detection; necessary for SELUs, which seem to be divergence-prone\n#            # Note that if we are unlucky enough to have diverged within the last 100 timesteps, this may not save us.\n#            if losslast100 > 2 * lossbetweensavesprev:\n#                print(\"We have diverged ! 
Restoring last savepoint!\")\n#                net.load_state_dict(torch.load('./torchmodel_'+suffix + '.txt'))\n#            else:\n            print(\"NOT saving files!\")\n#            lossbetweensavesprev = lossbetweensaves\n#            lossbetweensaves = 0\n#            sys.stdout.flush()\n#            sys.stderr.flush()\n\n\n\nif __name__ == \"__main__\":\n#defaultParams = {\n#    'type' : 'lstm',\n#    'seqlen' : 200,\n#    'hs': 500,\n#    'activ': 'tanh',\n#    'steplr': 10e9,  # By default, no change in the learning rate\n#    'gamma': .5,  # The annealing factor of learning rate decay for Adam\n#    'imagesize': 31,\n#    'nbiter': 30000,\n#    'lr': 1e-4,\n#    'test_every': 10,\n#    'save_every': 3000,\n#    'rngseed':0\n#}\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--rngseed\", type=int, help=\"random seed\", default=0)\n    #parser.add_argument(\"--clamp\", type=float, help=\"maximum (absolute value) gradient for clamping\", default=1000000.0)\n    #parser.add_argument(\"--wp\", type=float, help=\"wall penalty (reward decrement for hitting a wall)\", default=0.1)\n    parser.add_argument(\"--rew\", type=float, help=\"reward value (reward increment for taking correct action after correct stimulus)\", default=1.0)\n    parser.add_argument(\"--wp\", type=float, help=\"penalty for hitting walls\", default=.0)\n    #parser.add_argument(\"--pen\", type=float, help=\"penalty value (reward decrement for taking any non-rest action)\", default=.2)\n    #parser.add_argument(\"--exprew\", type=float, help=\"reward value (reward increment for hitting reward location)\", default=.0)\n    parser.add_argument(\"--bent\", type=float, help=\"coefficient for the entropy reward (really Simpson index concentration measure)\", default=0.03)\n    parser.add_argument(\"--blossv\", type=float, help=\"coefficient for value prediction loss\", default=.1)\n    #parser.add_argument(\"--probarev\", type=float, help=\"probability of reversal (random 
change) in desired stimulus-response, per time step\", default=0.0)\n    parser.add_argument(\"--bv\", type=float, help=\"coefficient for value prediction loss\", default=.1)\n    #parser.add_argument(\"--lsize\", type=int, help=\"size of the labyrinth; must be odd\", default=7)\n    #parser.add_argument(\"--randstart\", type=int, help=\"when hitting reward, should we teleport to random location (1) or center (0)?\", default=0)\n    #parser.add_argument(\"--rp\", type=int, help=\"whether the reward should be on the periphery\", default=0)\n    #parser.add_argument(\"--squash\", type=int, help=\"squash reward through signed sqrt (1 or 0)\", default=0)\n    #parser.add_argument(\"--nbarms\", type=int, help=\"number of arms\", default=2)\n    #parser.add_argument(\"--nbseq\", type=int, help=\"number of sequences between reinitializations of hidden/Hebbian state and position\", default=3)\n    #parser.add_argument(\"--activ\", help=\"activ function ('tanh' or 'selu')\", default='tanh')\n    parser.add_argument(\"--alg\", help=\"meta-learning algorithm (A3C or REI or REIE or REIT)\", default='REIT')\n    parser.add_argument(\"--rule\", help=\"learning rule ('hebb' or 'oja')\", default='hebb')\n    parser.add_argument(\"--type\", help=\"network type ('lstm' or 'rnn' or 'plastic')\", default='modul')\n    #parser.add_argument(\"--msize\", type=int, help=\"size of the maze; must be odd\", default=9)\n    parser.add_argument(\"--da\", help=\"transformation function of DA signal (tanh or sig or lin)\", default='tanh')\n    parser.add_argument(\"--gr\", type=float, help=\"gammaR: discounting factor for rewards\", default=.9)\n    parser.add_argument(\"--lr\", type=float, help=\"learning rate (Adam optimizer)\", default=1e-4)\n    parser.add_argument(\"--fm\", type=int, help=\"if using neuromodulation, do we modulate the whole network (1) or just half (0) ?\", default=1)\n    #parser.add_argument(\"--na\", type=int, help=\"number of actions (excluding \\\"rest\\\" action)\", 
default=2)\n    parser.add_argument(\"--ni\", type=int, help=\"number of different inputs\", default=2)\n    parser.add_argument(\"--nu\", type=float, help=\"REINFORCE baseline time constant\", default=.1)\n    #parser.add_argument(\"--samestep\", type=int, help=\"compare stimulus and response in the same step (1) or from successive steps (0) ?\", default=0)\n    #parser.add_argument(\"--nbin\", type=int, help=\"number of possible input stimuli\", default=4)\n    #parser.add_argument(\"--modhalf\", type=int, help=\"which half of the recurrent network receives modulation (1 or 2)\", default=1)\n    #parser.add_argument(\"--nbac\", type=int, help=\"number of possible non-rest actions\", default=4)\n    #parser.add_argument(\"--rsp\", type=int, help=\"does the agent start each episode from random position (1) or center (0) ?\", default=1)\n    parser.add_argument(\"--addpw\", type=int, help=\"are plastic weights purely additive (1) or forgetting (0) ?\", default=2)\n    parser.add_argument(\"--clamp\", type=int, help=\"inputs clamped (1), fully clamped (2) or through linear layer (0) ?\", default=0)\n    parser.add_argument(\"--eplen\", type=int, help=\"length of episodes\", default=100)\n    #parser.add_argument(\"--exptime\", type=int, help=\"exploration (no reward) time (must be < eplen)\", default=0)\n    parser.add_argument(\"--hs\", type=int, help=\"size of the recurrent (hidden) layer\", default=100)\n    parser.add_argument(\"--is\", type=int, help=\"do we initialize hidden state after each trial (1) or not (0) ?\", default=0)\n    parser.add_argument(\"--cs\", type=int, help=\"cue size - number of bits for each cue\", default=10)\n    parser.add_argument(\"--pf\", type=float, help=\"probability of flipping the reward (.5 = pure noise)\", default=0)\n    parser.add_argument(\"--l2\", type=float, help=\"coefficient of L2 norm (weight decay)\", default=1e-5)\n    parser.add_argument(\"--bs\", type=int, help=\"batch size\", default=1)\n    parser.add_argument(\"--gc\", type=float, help=\"gradient clipping\", default=1000.0)\n    parser.add_argument(\"--eps\", type=float, help=\"epsilon for Adam optimizer\", default=1e-6)\n    #parser.add_argument(\"--steplr\", type=int, help=\"duration of each step in the learning rate annealing schedule\", default=100000000)\n    #parser.add_argument(\"--gamma\", type=float, help=\"learning rate annealing factor\", default=0.3)\n    parser.add_argument(\"--nbiter\", type=int, help=\"number of learning cycles\", default=1000000)\n    parser.add_argument(\"--save_every\", type=int, help=\"number of cycles between successive save points\", default=200)\n    parser.add_argument(\"--pe\", type=int, help=\"'print every', number of cycles between successive printing of information\", default=100)\n    #parser.add_argument(\"--\", type=int, help=\"\", default=1e-4)\n    args = parser.parse_args(); argvars = vars(args); argdict = { k: argvars[k] for k in argvars if argvars[k] is not None }\n    #train()\n    train(argdict)\n\n"
  },
  {
    "path": "sr/srrun1episode.py",
    "content": "import argparse\nimport pdb\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Variable\nimport numpy as np\nfrom numpy import random\nimport torch.nn.functional as F\nfrom torch import optim\nfrom torch.optim import lr_scheduler\nimport random\nimport sys\nimport pickle\nimport time\nimport os\nimport platform\n#import makemaze\n\nimport numpy as np\n#import matplotlib.pyplot as plt\nimport glob\n\nimport modul\n\n\n\n\nnp.set_printoptions(precision=4)\n\n\n\nADDINPUT = 4 # 1 inputs for the previous reward, 1 inputs for numstep, 1 unused,  1 \"Bias\" inputs\n\n\ndef train(paramdict):\n\n    cuesshownall = []; rewardsprevstepall = []; modulatorall=[]\n\n    for numrun in range(4):\n        #params = dict(click.get_current_context().params)\n\n        #params['inputsize'] =  RFSIZE * RFSIZE + ADDINPUT + NBNONRESTACTIONS\n\n        #suffix = 'SRB_addpw_2_alg_A3C_bent_0.1_blossv_0.1_bs_30_bv_0.1_clamp_0_cs_20_da_tanh_eplen_120_eps_1e-06_fm_1_gc_2.0_gr_0.9_hs_200_is_0_l2_0.0_lr_0.0001_nbiter_200000_ni_4_nu_0.1_pf_0.0_rew_1.0_rule_hebb_type_modul_wp_0.0_rngseed_11'\n\n        suffix = 'SRB_addpw_2_alg_A3C_bent_0.1_blossv_0.1_bs_30_bv_0.1_clamp_0_cs_20_da_tanh_eplen_120_eps_1e-06_fm_1_gc_2.0_gr_0.9_hs_200_is_0_l2_0.0_lr_0.0001_nbiter_200000_ni_4_nu_0.1_pe_500_pf_0.0_rew_1.0_rule_hebb_type_modplast_wp_0.0_rngseed_'+str(numrun)\n        \n        print(\"Starting training...\")\n        params = {}\n        #params.update(defaultParams)\n        params.update(paramdict)\n\n        with open('./tmp/params_'+suffix+'.dat', 'rb') as fo:\n            params = pickle.load(fo)\n\n        params['nbiter'] = 1\n        params['bs'] = 1\n\n\n        print(\"Used params: \", params)\n        print(platform.uname())\n        #params['nbsteps'] = params['nbshots'] * ((params['prestime'] + params['interpresdelay']) * params['nbclasses']) + params['prestimetest']  # Total number of steps per episode\n        #NBINPUTBITS = params['ni'] + 1 \n        
NBINPUTBITS = params['cs'] + 1 # The additional bit is for the response cue (i.e. the \"Go\" cue)\n        params['outputsize'] =  2  # \"response\" and \"no response\"\n        params['inputsize'] = NBINPUTBITS +  params['outputsize'] + ADDINPUT  # The total number of input bits is the size of inputs, plus the \"response cue\" input, plus the number of actions, plus the number of additional inputs\n\n        # This doesn't work with our version of PyTorch\n        #params['device'] = 'gpu'\n        #device = torch.device(\"cuda:0\" if self.params['device'] == 'gpu' else \"cpu\")\n        BS = params['bs']\n\n        # Initialize random seeds (first two redundant?)\n        print(\"Setting random seeds\")\n        np.random.seed(params['rngseed']); random.seed(params['rngseed']); torch.manual_seed(params['rngseed'])\n        #print(click.get_current_context().params)\n\n        print(\"Initializing network\")\n        if params['type'] == 'modul':\n            net = modul.RetroModulRNN(params)\n        elif params['type'] == 'modplast':\n            net = modul.SimpleModulRNN(params)\n        elif params['type'] == 'plastic':\n            net = modul.PlasticRNN(params)\n        elif params['type'] == 'rnn':\n            net = modul.NonPlasticRNN(params)\n        else:\n            raise ValueError(\"Network type unknown or not yet implemented: \"+params['type'])\n\n        net.load_state_dict(torch.load('./tmp/torchmodel_'+suffix+'.dat'))\n\n        print (\"Shape of all optimized parameters:\", [x.size() for x in net.parameters()])\n        allsizes = [torch.numel(x.data.cpu()) for x in net.parameters()]\n        print (\"Size (numel) of all optimized elements:\", allsizes)\n        print (\"Total size (numel) of all optimized elements:\", sum(allsizes))\n\n        #total_loss = 0.0\n        print(\"Initializing optimizer\")\n       \n        #optimizer = torch.optim.Adam(net.parameters(), lr=1.0*params['lr'], eps=params['eps'], weight_decay=params['l2'])\n\n\n    
    all_losses = []\n        all_grad_norms = []\n        all_losses_objective = []\n        all_total_rewards = []\n        all_losses_v = []\n        lossbetweensaves = 0\n        nowtime = time.time()\n        #meanreward = np.zeros((LABSIZE, LABSIZE))\n        meanreward = np.zeros(params['ni'])\n        meanrewardT = np.zeros((params['ni'], params['eplen']))\n\n        nbtrials = [0]*BS\n        totalnbtrials = 0\n        nbtrialswithcc = 0\n\n\n        print(\"Starting episodes!\")\n\n        for numepisode in range(params['nbiter']):\n\n            PRINTTRACE = 1\n            #if (numepisode+1) % (params['pe']) == 0:\n            #    PRINTTRACE = 1\n\n            #optimizer.zero_grad()\n            loss = 0\n            lossv = 0\n            hidden = net.initialZeroState()\n            if params['type'] != 'rnn':\n                hebb = net.initialZeroHebb()\n            if params['type'] == 'modul':\n                et = net.initialZeroHebb() # Eligibility Trace is identical to Hebbian Trace in shape\n                pw = net.initialZeroPlasticWeights()\n            numactionchosen = 0\n\n\n            # Generate the cues. Make sure they're all different (important when using very small cues for debugging, e.g. cs=2, ni=2)\n            cuedata=[]\n            for nb in range(BS):\n                cuedata.append([])\n                for ncue in range(params['ni']):\n                    assert len(cuedata[nb]) == ncue\n                    foundsame = 1\n                    cpt = 0\n                    while foundsame > 0 :\n                        cpt += 1\n                        if cpt > 10000:\n                            # This should only occur with very weird parameters, e.g. 
cs=2, ni>4\n                            raise ValueError(\"Could not generate a full list of different cues\")\n                        foundsame = 0\n                        candidate = np.random.randint(2, size=params['cs']) * 2 - 1\n                        for backtrace in range(ncue):\n                            if np.array_equal(cuedata[nb][backtrace], candidate):\n                                foundsame = 1\n\n                    cuedata[nb].append(candidate)\n\n\n            reward = np.zeros(BS)\n            sumreward = np.zeros(BS)\n            rewards = []\n            vs = []\n            logprobs = []\n            cues=[]\n            for nb in range(BS):\n                cues.append([])\n            dist = 0\n            numactionschosen = np.zeros(BS, dtype='int32')\n\n            #reward = 0.0\n            #rewards = []\n            #vs = []\n            #logprobs = []\n            #sumreward = 0.0\n            nbtrials = np.zeros(BS)\n            nbrewardabletrials = np.zeros(BS)\n            thistrialhascorrectcue = np.zeros(BS)\n            triallength = np.zeros(BS, dtype='int32')\n            correctcue = np.random.randint(params['ni'], size=BS)\n\n            trialstep = np.zeros(BS, dtype='int32')  \n\n            modulator0 = []\n            cuesshown0 = []\n            rewardsprevstep0 = []\n\n            #print(\"EPISODE \", numepisode)\n            for numstep in range(params['eplen']):\n\n                #if params['clamp'] == 0:\n                inputs = np.zeros((BS, params['inputsize']), dtype='float32') \n                #else:\n                #    inputs = np.zeros((1, params['hs']), dtype='float32')\n\n                for nb in range(BS):\n                \n                    if trialstep[nb] == 0:\n                        thistrialhascorrectcue[nb] = 0\n                        # Trial length is randomly modulated for each trial; first time step always -1 (i.e. no input cue), last time step always response-cue (i.e. 
NBINPUTBITS-1).\n                        #triallength = params['ni'] // 2  + 3 + np.random.randint(1 + params['ni'])  # 3 fixed-cue time steps (1st, last and next-to-last) + some random nb of no-cue time steps\n                        triallength[nb] = params['ni'] // 2  + 3 + np.random.randint(params['ni'])  # 3 fixed-cue time steps (1st, last and next-to-last) + some random nb of no-cue time steps\n                        \n                        \n                        \n                        # In any trial, we only show half the cues (randomly chosen), once each:\n                        mycues = [x for x in range(params['ni'])]\n                        random.shuffle(mycues); mycues = mycues[:len(mycues) // 2]\n                        # The rest is filled with no-input time steps (i.e. cue = -1), but also with the 3 fixed-cue steps (1st, last, next-to-last) \n                        for nc in range(triallength[nb] - 3  - len(mycues)):\n                            mycues.append(-1)\n                        random.shuffle(mycues)\n                        mycues.insert(0, -1); mycues.append(params['ni']); mycues.append(-1)  # The first and last time step have no input (cue -1), the next-to-last has the response cue.\n                        assert(len(mycues) == triallength[nb])\n                        cues[nb] = mycues\n\n                    inputs[nb, :NBINPUTBITS] = 0\n                    if cues[nb][trialstep[nb]] > -1 and cues[nb][trialstep[nb]] < params['ni']:\n                        #inputs[0, cues[trialstep]] = 1.0\n                        inputs[nb, :NBINPUTBITS-1] = cuedata[nb][cues[nb][trialstep[nb]]][:]\n                        if cues[nb][trialstep[nb]] == correctcue[nb]:\n                            thistrialhascorrectcue[nb] = 1\n                    if cues[nb][trialstep[nb]] == params['ni']:\n                        inputs[nb, NBINPUTBITS-1] = 1  # \"Go\" cue\n                        \n\n                    inputs[nb, NBINPUTBITS + 0] = 1.0 
# Bias neuron, probably not necessary\n                    inputs[nb,NBINPUTBITS +  1] = numstep / params['eplen']\n                    inputs[nb, NBINPUTBITS + 2] = 1.0 * reward[nb] # Reward from previous time step\n                    if numstep > 0:\n                        inputs[nb, NBINPUTBITS + ADDINPUT + numactionschosen[nb]] = 1  # Previously chosen action\n\n                inputsC = torch.from_numpy(inputs).cuda()\n                # Might be better:\n                #if rposr == posr and rposc = posc:\n                #    inputs[0][-4] = 100.0\n                #else:\n                #    inputs[0][-4] = 0\n                \n                # Running the network\n\n                ## Running the network\n                if params['type'] == 'modplast':\n                    y, v, DAout, hidden, hebb = net(Variable(inputsC, requires_grad=False), hidden, hebb)  # y  should output raw scores, not probas\n                elif params['type'] == 'modul':\n                    y, v, DAout, hidden, hebb, et, pw  = net(Variable(inputsC, requires_grad=False), hidden, hebb, et, pw)  # y  should output raw scores, not probas\n                elif params['type'] == 'plastic':\n                    y, v, hidden, hebb = net(Variable(inputsC, requires_grad=False), hidden, hebb)  # y  should output raw scores, not probas\n                elif params['type'] == 'rnn':\n                    y, v, hidden = net(Variable(inputsC, requires_grad=False), hidden)  # y  should output raw scores, not probas\n                else:\n                    raise ValueError(\"Network type unknown or not yet implemented!\")\n\n\n\n                y = F.softmax(y, dim=1)\n                # Must convert y to probas to use this !\n                distrib = torch.distributions.Categorical(y)\n                actionschosen = distrib.sample()  \n                logprobs.append(distrib.log_prob(actionschosen))\n                numactionschosen = actionschosen.data.cpu().numpy()    # Turn to 
scalar\n\n                if PRINTTRACE:\n                    print(\"Step \", numstep, \" Inputs (1st in batch): \", inputs[0,:params['inputsize']], \" - Outputs(0): \", y.data.cpu().numpy()[0,:], \" - action chosen(0): \", numactionschosen[0],\n                            \"TrialLen(0):\", triallength[0], \"trialstep(0):\", trialstep[0], \"TTHCC(0): \", thistrialhascorrectcue[0], \" -Reward (previous step): \", reward[0], \", cues(0):\", cues[0], \", cc(0):\", correctcue[0])\n\n\n                    #print(\"Step \", numstep, \" Inputs: \", inputs[0,:params['inputsize']], \" - Outputs: \", y.data.cpu().numpy(), \" - action chosen: \", numactionchosen,\n                    #        \" - mean abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())), \"TrialLen:\", triallength, \"trialstep:\", trialstep, \"TTHCC: \", thistrialhascorrectcue, \" -Reward (previous step): \", reward, \", cues:\", cues, \", cc:\", correctcue)\n\n                cuesshown0.append(cues[0][trialstep[0]])\n                rewardsprevstep0.append(float(reward[0]))\n                modulator0.append(float(DAout[0]))\n                \n                reward = np.zeros(BS, dtype='float32')\n                \n                \n\n                for nb in range(BS):\n                    if numactionschosen[nb] == 1:\n                        # Small penalty for any non-rest action taken\n                        reward[nb]  -= params['wp']\n                \n                \n                ### DEBUGGING\n                ## Easiest possible episode-dependent response (i.e. the easiest\n                ## possible problem that actually requires meta-learning, with ni=2)\n                ## This one works pretty well... 
But harder ones don't work well!\n                #if numactionchosen == correctcue :\n                #        reward = params['rew']\n                #else:\n                #        reward = -params['rew']\n\n\n                    trialstep[nb] += 1\n                    if trialstep[nb] == triallength[nb] - 1:\n                        # This was the next-to-last step of the trial (and we showed the response signal, unless it was the first few steps in episode). \n                        assert(cues[nb][trialstep[nb] - 1] == params['ni'] or numstep < 2)\n                        # We must deliver reward (which will be perceived by the agent at the next step), positive or negative, depending on response\n                        if thistrialhascorrectcue[nb] and numactionschosen[nb] == 1:\n                            reward[nb] += params['rew']\n                        elif (not thistrialhascorrectcue[nb]) and numactionschosen[nb] == 0:\n                            reward[nb] += params['rew']\n                        else:\n                            reward[nb] -= params['rew']\n\n                        if np.random.rand() < params['pf']:\n                            reward[nb] = -reward[nb]\n                    \n                    if trialstep[nb] == triallength[nb]:\n                        # This was the last step of the trial (and we showed no input)\n                        assert(cues[nb][trialstep[nb] - 1] == -1 or numstep < 2)\n                        nbtrials[nb] += 1\n                        totalnbtrials += 1\n                        if thistrialhascorrectcue[nb]:\n                            nbtrialswithcc += 1\n                            #nbrewardabletrials += 1 \n                        # Trial is dead, long live trial\n                        trialstep[nb] = 0\n\n                        # We initialize the hidden state between trials!\n                        #if params['is'] == 1:\n                        #    hidden = 
net.initialZeroState()\n\n\n\n                rewards.append(reward)\n                vs.append(v)\n                sumreward += reward\n\n\n\n                #if params['alg'] in ['A3C' , 'REIE' , 'REIT']:\n                \n                loss += (params['bent'] * y.pow(2).sum() / BS )   # We want to penalize concentration, i.e. encourage diversity; our version of PyTorch does not have an entropy() function for Distribution, so we use this instead.\n\n                \n\n                ##if PRINTTRACE:\n                ##    print(\"Probabilities:\", y.data.cpu().numpy(), \"Picked action:\", numactionchosen, \", got reward\", reward)\n            \n            R = Variable(torch.zeros(BS).cuda(), requires_grad=False)\n            gammaR = params['gr']\n            for numstepb in reversed(range(params['eplen'])) :\n                R = gammaR * R + Variable(torch.from_numpy(rewards[numstepb]).cuda(), requires_grad=False)\n                ctrR = R - vs[numstepb][0]\n                lossv += ctrR.pow(2).sum() / BS\n                loss -= (logprobs[numstepb] * ctrR.detach()).sum() / BS  # Need to check if detach() is OK\n                #pdb.set_trace()\n\n\n            # Episode is done, now let's do the actual computations\n            #gammaR = params['gr']\n            #if params['alg'] == 'A3C':\n            #    R = 0\n            #    for numstepb in reversed(range(params['eplen'])) :\n            #        R = gammaR * R + rewards[numstepb]\n            #        lossv += (vs[numstepb][0] - R).pow(2)\n            #        loss -= logprobs[numstepb] * (R - vs[numstepb].data[0][0])  # Not sure if the \"data\" is needed... 
put it b/c of worry about weird gradient flows\n            #    loss += params['bv'] * lossv\n\n            #elif params['alg'] in ['REI', 'REIE']:\n            #    R = sumreward\n            #    baseline = meanreward[correctcue]\n            #    for numstepb in reversed(range(params['eplen'])) :\n            #        loss -= logprobs[numstepb] * (R - baseline)\n            #elif params['alg'] == 'REIT':\n            #    R = 0\n            #    for numstepb in reversed(range(params['eplen'])) :\n            #        R = gammaR * R + rewards[numstepb]\n            #        loss -= logprobs[numstepb] * (R - meanrewardT[correctcue, numstepb])\n            #else:\n            #    raise ValueError(\"Must select algo type\")\n            #elif params['alg'] == 'REINOB':\n            #    R = sumreward\n            #    for numstepb in reversed(range(params['eplen'])) :\n            #        loss -= logprobs[numstepb] * R\n            #elif params['alg'] == 'REITMP':\n            #    R = 0\n            #    for numstepb in reversed(range(params['eplen'])) :\n            #        R = gammaR * R + rewards[numstepb]\n            #        loss -= logprobs[numstepb] * R\n\n            #else:\n            #    raise ValueError(\"Which algo?\")\n\n            #meanreward[correctcue] = (1.0 - params['nu']) * meanreward[correctcue] + params['nu'] * sumreward\n            ##meanreward[rposr, rposc] = (1.0 - params['nu']) * meanreward[rposr, rposc] + params['nu'] * sumreward\n            #R = 0\n            #for numstepb in reversed(range(params['eplen'])) :\n            #    R = gammaR * R + rewards[numstepb]\n            #    meanrewardT[correctcue, numstepb] = (1.0 - params['nu']) * meanrewardT[correctcue, numstepb] + params['nu'] * R\n\n            loss += params['blossv'] * lossv\n            loss /= params['eplen']\n\n            if PRINTTRACE:\n                #if params['alg'] == 'A3C':\n                print(\"lossv: \", float(lossv))\n                #elif 
params['alg'] in ['REI', 'REIE', 'REIT']:\n                #    print(\"meanreward baselines: \", [meanreward[x] for x in range(params['ni'])])\n                print (\"Total reward for this episode(0):\", sumreward[0], \"Prop. of trials w/ rewarded cue:\", (nbtrialswithcc / totalnbtrials))\n                #print(\"Nb trials for this episode(0):\", nbtrials[0], \"[2]:\",nbtrials[2],\" Total Nb of trials:\", totalnbtrials)\n\n            #if params['squash'] == 1:\n            #    if sumreward < 0:\n            #        sumreward = -np.sqrt(-sumreward)\n            #    else:\n            #        sumreward = np.sqrt(sumreward)\n            #elif params['squash'] == 0:\n            #    pass\n            #else:\n            #    raise ValueError(\"Incorrect value for squash parameter\")\n\n            #loss *= sumreward\n\n            #loss.backward()\n            #all_grad_norms.append(torch.nn.utils.clip_grad_norm(net.parameters(), params['gc']))\n            #if numepisode > 100:  # Burn-in period for meanreward\n            #    optimizer.step()\n\n\n            #print(sumreward)\n            lossnum = float(loss)\n            lossbetweensaves += lossnum\n            all_losses_objective.append(lossnum)\n            all_total_rewards.append(sumreward.mean())\n            #all_total_rewards.append(sumreward[0])\n                #all_losses_v.append(lossv.data[0])\n            #total_loss  += lossnum\n\n\n\n            if (numepisode+1) % params['pe'] == 0:\n\n                print(numepisode, \"====\")\n                print(\"Mean loss: \", lossbetweensaves / params['pe'])\n                lossbetweensaves = 0\n                print(\"Mean reward: \", np.sum(all_total_rewards[-params['pe']:])/ params['pe'])\n                previoustime = nowtime\n                nowtime = time.time()\n                print(\"Time spent on last\", params['pe'], \"iters: \", nowtime - previoustime)\n                if params['type'] == 'plastic' or params['type'] == 
'lstmplastic':\n                    print(\"ETA: \", float(net.eta), \"alpha[0,1]: \", net.alpha.data.cpu().numpy()[0,1], \"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n                elif params['type'] == 'modul' or params['type'] == 'modul2':\n                    print(\"ETA: \", net.eta.data.cpu().numpy(), \" etaet: \", net.etaet.data.cpu().numpy(), \" mean-abs pw: \", np.mean(np.abs(pw.data.cpu().numpy())))\n                elif params['type'] == 'rnn':\n                    print(\"w[0,1]: \", net.w.data.cpu().numpy()[0,1] )\n            \n            if (numepisode+1) % params['save_every'] == 0:\n                print(\"Saving files...\")\n    #            lossbetweensaves /= params['save_every']\n    #            print(\"Average loss over the last\", params['save_every'], \"episodes:\", lossbetweensaves)\n    #            print(\"Alternative computation (should be equal):\", np.mean(all_losses_objective[-params['save_every']:]))\n                losslast100 = np.mean(all_losses_objective[-100:])\n                print(\"Average loss over the last 100 episodes:\", losslast100)\n    #            # Instability detection; necessary for SELUs, which seem to be divergence-prone\n    #            # Note that if we are unlucky enough to have diverged within the last 100 timesteps, this may not save us.\n    #            if losslast100 > 2 * lossbetweensavesprev:\n    #                print(\"We have diverged ! 
Restoring last savepoint!\")\n    #                net.load_state_dict(torch.load('./torchmodel_'+suffix + '.txt'))\n    #            else:\n                print(\"NOT saving files!\")\n    #            lossbetweensavesprev = lossbetweensaves\n    #            lossbetweensaves = 0\n    #            sys.stdout.flush()\n    #            sys.stderr.flush()\n\n        modulatorall.append(modulator0)\n        cuesshownall.append(cuesshown0)\n\n        rewardsprevstepall.append(rewardsprevstep0)\n\n    np.save('cueshown0.dat', np.array(cuesshownall))\n    np.save('modulator0.dat', np.array(modulatorall))\n    np.save('rewardsprevstep0.dat', np.array(rewardsprevstepall))\n\n\nif __name__ == \"__main__\":\n#defaultParams = {\n#    'type' : 'lstm',\n#    'seqlen' : 200,\n#    'hs': 500,\n#    'activ': 'tanh',\n#    'steplr': 10e9,  # By default, no change in the learning rate\n#    'gamma': .5,  # The annealing factor of learning rate decay for Adam\n#    'imagesize': 31,\n#    'nbiter': 30000,\n#    'lr': 1e-4,\n#    'test_every': 10,\n#    'save_every': 3000,\n#    'rngseed':0\n#}\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--rngseed\", type=int, help=\"random seed\", default=0)\n    #parser.add_argument(\"--clamp\", type=float, help=\"maximum (absolute value) gradient for clamping\", default=1000000.0)\n    #parser.add_argument(\"--wp\", type=float, help=\"wall penalty (reward decrement for hitting a wall)\", default=0.1)\n    parser.add_argument(\"--rew\", type=float, help=\"reward value (reward increment for taking correct action after correct stimulus)\", default=1.0)\n    parser.add_argument(\"--wp\", type=float, help=\"penalty for hitting walls\", default=.0)\n    #parser.add_argument(\"--pen\", type=float, help=\"penalty value (reward decrement for taking any non-rest action)\", default=.2)\n    #parser.add_argument(\"--exprew\", type=float, help=\"reward value (reward increment for hitting reward location)\", default=.0)\n    
parser.add_argument(\"--bent\", type=float, help=\"coefficient for the entropy reward (really Simpson index concentration measure)\", default=0.03)\n    parser.add_argument(\"--blossv\", type=float, help=\"coefficient for value prediction loss\", default=.1)\n    #parser.add_argument(\"--probarev\", type=float, help=\"probability of reversal (random change) in desired stimulus-response, per time step\", default=0.0)\n    parser.add_argument(\"--bv\", type=float, help=\"coefficient for value prediction loss\", default=.1)\n    #parser.add_argument(\"--lsize\", type=int, help=\"size of the labyrinth; must be odd\", default=7)\n    #parser.add_argument(\"--randstart\", type=int, help=\"when hitting reward, should we teleport to random location (1) or center (0)?\", default=0)\n    #parser.add_argument(\"--rp\", type=int, help=\"whether the reward should be on the periphery\", default=0)\n    #parser.add_argument(\"--squash\", type=int, help=\"squash reward through signed sqrt (1 or 0)\", default=0)\n    #parser.add_argument(\"--nbarms\", type=int, help=\"number of arms\", default=2)\n    #parser.add_argument(\"--nbseq\", type=int, help=\"number of sequences between reinitializations of hidden/Hebbian state and position\", default=3)\n    #parser.add_argument(\"--activ\", help=\"activation function ('tanh' or 'selu')\", default='tanh')\n    parser.add_argument(\"--alg\", help=\"meta-learning algorithm (A3C or REI or REIE or REIT)\", default='REIT')\n    parser.add_argument(\"--rule\", help=\"learning rule ('hebb' or 'oja')\", default='hebb')\n    parser.add_argument(\"--type\", help=\"network type ('lstm', 'rnn', 'plastic' or 'modul')\", default='modul')\n    #parser.add_argument(\"--msize\", type=int, help=\"size of the maze; must be odd\", default=9)\n    parser.add_argument(\"--da\", help=\"transformation function of DA signal (tanh or sig or lin)\", default='tanh')\n    parser.add_argument(\"--gr\", type=float, help=\"gammaR: discounting factor for rewards\", default=.9)\n
    parser.add_argument(\"--lr\", type=float, help=\"learning rate (Adam optimizer)\", default=1e-4)\n    parser.add_argument(\"--fm\", type=int, help=\"if using neuromodulation, do we modulate the whole network (1) or just half (0) ?\", default=1)\n    #parser.add_argument(\"--na\", type=int, help=\"number of actions (excluding \\\"rest\\\" action)\", default=2)\n    parser.add_argument(\"--ni\", type=int, help=\"number of different inputs\", default=2)\n    parser.add_argument(\"--nu\", type=float, help=\"REINFORCE baseline time constant\", default=.1)\n    #parser.add_argument(\"--samestep\", type=int, help=\"compare stimulus and response in the same step (1) or from successive steps (0) ?\", default=0)\n    #parser.add_argument(\"--nbin\", type=int, help=\"number of possible input stimuli\", default=4)\n    #parser.add_argument(\"--modhalf\", type=int, help=\"which half of the recurrent network receives modulation (1 or 2)\", default=1)\n    #parser.add_argument(\"--nbac\", type=int, help=\"number of possible non-rest actions\", default=4)\n    #parser.add_argument(\"--rsp\", type=int, help=\"does the agent start each episode from random position (1) or center (0) ?\", default=1)\n    parser.add_argument(\"--addpw\", type=int, help=\"are plastic weights purely additive (1) or forgetting (0) ?\", default=2)\n    parser.add_argument(\"--clamp\", type=int, help=\"inputs clamped (1), fully clamped (2) or through linear layer (0) ?\", default=0)\n    parser.add_argument(\"--eplen\", type=int, help=\"length of episodes\", default=100)\n    #parser.add_argument(\"--exptime\", type=int, help=\"exploration (no reward) time (must be < eplen)\", default=0)\n    parser.add_argument(\"--hs\", type=int, help=\"size of the recurrent (hidden) layer\", default=100)\n    parser.add_argument(\"--is\", type=int, help=\"do we initialize hidden state after each trial (1) or not (0) ?\", default=0)\n    parser.add_argument(\"--cs\", type=int, help=\"cue size - number of bits for each cue\", default=10)\n
    parser.add_argument(\"--pf\", type=float, help=\"probability of flipping the reward (.5 = pure noise)\", default=0)\n    parser.add_argument(\"--l2\", type=float, help=\"coefficient of L2 norm (weight decay)\", default=1e-5)\n    parser.add_argument(\"--bs\", type=int, help=\"batch size\", default=1)\n    parser.add_argument(\"--gc\", type=float, help=\"gradient clipping\", default=1000.0)\n    parser.add_argument(\"--eps\", type=float, help=\"epsilon for Adam optimizer\", default=1e-6)\n    #parser.add_argument(\"--steplr\", type=int, help=\"duration of each step in the learning rate annealing schedule\", default=100000000)\n    #parser.add_argument(\"--gamma\", type=float, help=\"learning rate annealing factor\", default=0.3)\n    parser.add_argument(\"--nbiter\", type=int, help=\"number of learning cycles\", default=1000000)\n    parser.add_argument(\"--save_every\", type=int, help=\"number of cycles between successive save points\", default=200)\n    parser.add_argument(\"--pe\", type=int, help=\"'print every', number of cycles between successive printing of information\", default=100)\n    #parser.add_argument(\"--\", type=int, help=\"\", default=1e-4)\n    args = parser.parse_args(); argvars = vars(args); argdict = { k: argvars[k] for k in argvars if argvars[k] is not None }\n    #train()\n    train(argdict)\n\n"
  }
]