[
  {
    "path": ".gitignore",
    "content": "# Linux\n*.swp\n*.swo\n*.swn\n\n# Virtual environment\n/[Ii]nclude\n/[Ll]ib\n/Scripts\n/bin\n/local\n/man\n/share\n\n# Python\n__pycache__/\n*.py[cod]\n\n# Pip\npip-selfcheck.json\n*.whl\n*.egg\n*.egg-info\n\n# Pytest\n.cache\n.coverage\n.coverage*\n\n# Setuptools\n/build\n/dist\n*.eggs\n\n# Sphinx\ndoc/_build\n"
  },
  {
    "path": ".travis.yml",
    "content": "language: python\npython:\n  - \"3.5\"\n  - \"3.4\"\n  - \"3.3\"\ninstall:\n  - pip install coveralls\nscript:\n  - python setup.py test\n  - python setup.py install\n  - python setup.py build_sphinx\nafter_success:\n  - coveralls\n"
  },
  {
    "path": "LICENSE.md",
    "content": "The MIT License (MIT)\n\nCopyright (c) 2015 Danijar Hafner\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "[![Build Status][1]][2]\n[![Code Climate][3]][4]\n[![Documentation][5]][6]\n\n[1]: https://travis-ci.org/danijar/layered.svg?branch=master\n[2]: https://travis-ci.org/danijar/layered\n[3]: https://codeclimate.com/github/danijar/layered/badges/gpa.svg\n[4]: https://codeclimate.com/github/danijar/layered\n[5]: https://readthedocs.org/projects/pip/badge/\n[6]: https://layered.readthedocs.org/en/latest/\n\nLayered\n=======\n\nThis project aims to be a clean and modular implementation of feed forward\nneural networks. It's written in Python 3 and published under the MIT license.\nI started this project in order to understand the concepts of deep learning.\nYou can use this repository as guidance if you want to implement neural\nnetworks what I highly recommend if you are interested in understanding them.\n\nInstructions\n------------\n\nThis will train a network with 1.3M weights to classify handwritten digits and\nvisualize the progress. After a couple of minutes, the error should drop below\n3%. To install globally, just skip the first command. Solutions to all reported\nproblems can be found in the troubleshooting section.\n\n```bash\nvirtualenv . 
-p python3 --system-site-packages && source bin/activate\npip3 install layered\ncurl -o mnist.yaml -L http://git.io/vr7y1\nlayered mnist.yaml -v\n```\n\n### Problem Definition\n\nLearning problems are defined in YAML files and it's easy to create your own.\nAn overview of the available cost and activation functions is given a few\nsections below.\n\n```yaml\ndataset: Mnist\ncost: CrossEntropy\nlayers:\n- activation: Identity\n  size: 784\n- activation: Relu\n  size: 700\n- activation: Relu\n  size: 700\n- activation: Relu\n  size: 400\n- activation: Softmax\n  size: 10\nepochs: 5\nbatch_size: 32\nlearning_rate: 0.01\nmomentum: 0.9\nweight_scale: 0.01\nweight_decay: 0\nevaluate_every: 5000\n```\n\n### Command Line Arguments\n\n```\nlayered [-h] [-v] [-l weights.npy] [-s weights.npy] problem.yaml\n```\n\n| Short | Long | Description |\n| :---- | :--- | :---------- |\n| `-h` | `--help` | Print usage instructions |\n| `-v` | `--visual` | Show a diagram of training costs and testing error |\n| `-l` | `--load` | Path to load learned weights from at startup |\n| `-s` | `--save` | Path to dump the learned weights at each evaluation |\n\n### Contribution\n\nOptionally, create a virtual environment. Then install the dependencies. The\nlast command checks that everything works.\n\n```bash\ngit clone https://github.com/danijar/layered.git && cd layered\nvirtualenv . -p python3 --system-site-packages && source bin/activate\npip3 install -e .\npython3 -m layered problem/modulo.yaml -v\n```\n\nNow you can start playing around with the code. For pull requests, please\nsquash the changes to a single commit and ensure that the linters and tests are\npassing.\n\n```bash\npython setup.py test\n```\n\nIf you have questions, feel free to contact me.\n\nAdvanced Guide\n--------------\n\nIn this guide you will learn how to create and train models manually rather\nthan using problem definitions, to gain more insight into training neural\nnetworks. 
Let's start!\n\n### Step 1: Network Definition\n\nA network is defined by its layers. The parameters for a layer are the number\nof neurons and the activation function. The first layer has the identity\nfunction since we don't want to modify the input data before feeding it\ninto the network.\n\n```python\nfrom layered.network import Network, Layer\nfrom layered.activation import Identity, Relu, Softmax\n\nnum_inputs = 784\nnum_outputs = 10\n\nnetwork = Network([\n    Layer(num_inputs, Identity),\n    Layer(700, Relu),\n    Layer(500, Relu),\n    Layer(300, Relu),\n    Layer(num_outputs, Softmax),\n])\n```\n\n### Step 2: Activation Functions\n\n| Function | Description | Definition | __________Graph__________ |\n| -------- | ----------- | :--------: | ------------------------- |\n| Identity | Don't transform the incoming data. That's what you would expect at input layers. | x | ![Identity](image/identity.png) |\n| Relu | Fast non-linear function that has proven to be effective in deep networks. | max(0, x) | ![Relu](image/relu.png) |\n| Sigmoid | The de facto standard activation before Relu came along. Smoothly maps the incoming activation into a range from zero to one. | 1 / (1 + exp(-x)) | ![Sigmoid](image/sigmoid.png) |\n| Softmax | Smooth activation function where the outgoing activations sum up to one. It's commonly used for output layers in classification because the outgoing activations can be interpreted as probabilities. | exp(x) / sum(exp(x)) | ![Softmax](image/softmax.png) |\n\n### Step 3: Weight Initialization\n\nThe weight matrices of the network are handed to algorithms like\nbackpropagation, gradient descent and weight decay. If the initial weights of a\nneural network were zero, no activation would be passed to the deeper\nlayers. 
So we start with random values sampled from a normal distribution.\n\n```python\nimport numpy as np\n\nfrom layered.network import Matrices\n\nweight_scale = 0.01\nweights = Matrices(network.shapes)\nweights.flat = np.random.normal(0, weight_scale, len(weights.flat))\n```\n\n### Step 4: Optimization Algorithm\n\nNow let's learn good weights with standard backpropagation and gradient\ndescent. The classes for this can be imported from the `gradient` and\n`optimization` modules. We also need a cost function.\n\n```python\nfrom layered.cost import SquaredError\nfrom layered.gradient import Backprop\nfrom layered.optimization import GradientDecent\n\nbackprop = Backprop(network, cost=SquaredError())\ndescent = GradientDecent()\n```\n\n### Step 5: Cost Functions\n\n| Function | Description | Definition | __________Graph__________ |\n| -------- | ----------- | :--------: | ------------------------- |\n| SquaredError | The most common cost function. The difference is squared to always be positive and to penalize large errors more strongly. | (pred - target) ^ 2 / 2 | ![Squared Error](image/squared-error.png) |\n| CrossEntropy | Logistic cost function useful for classification tasks. Commonly used in conjunction with Softmax output layers. | -((target * log(pred)) + (1 - target) * log(1 - pred)) | ![Cross Entropy](image/cross-entropy.png) |\n\n### Step 6: Dataset and Training\n\nDatasets are automatically downloaded and cached. We just iterate over the\ntraining examples and train the weights on them.\n\n```python\nfrom layered.dataset import Mnist\n\ndataset = Mnist()\nfor example in dataset.training:\n    gradient = backprop(weights, example)\n    weights = descent(weights, gradient, learning_rate=0.1)\n```\n\n### Step 7: Evaluation\n\nFinally, we want to see what our network has learned. We do this by letting the\nnetwork predict classes for the testing examples. 
The strongest class is the\nmodel's best bet, thus the `np.argmax`.\n\n```python\nimport numpy as np\n\nerror = 0\nfor example in dataset.testing:\n    prediction = network.feed(weights, example.data)\n    if np.argmax(prediction) != np.argmax(example.target):\n        error += 1 / len(dataset.testing)\nprint('Testing error', round(100 * error, 2), '%')\n```\n\nTroubleshooting\n---------------\n\n### Failed building wheel\n\nYou can safely ignore these messages during installation.\n\n### Python is not installed as a framework\n\nIf you get this error on Mac, skip the virtualenv and install layered\nglobally with `sudo pip3 install layered`.\n\n### Crash at startup\n\nInstall or reinstall `python3-matplotlib` or equivalent using your package\nmanager. Check if matplotlib works outside of the virtualenv.\n\n```python\nimport matplotlib.pyplot as plt\nplt.plot([1, 2, 3, 4])\nplt.show()\n```\n\nEnsure you create your virtualenv with `--system-site-packages`.\n\n### Did you encounter another problem?\n\nPlease [open an issue][10].\n\n[10]: https://github.com/danijar/layered/issues\n"
  },
  {
    "path": "dataset/.gitignore",
    "content": "*\n!.gitignore\n"
  },
  {
    "path": "doc/conf.py",
    "content": "#!/usr/bin/env python3\n\nimport sys\nimport os\nfrom unittest.mock import MagicMock\n\n\nsys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))\nextensions = [\n    'sphinx.ext.autodoc',\n    'sphinx.ext.coverage',\n    'sphinx.ext.viewcode',\n]\n\nMOCK_MODULES = [\n    'yaml',\n    'numpy',\n    'matplotlib',\n    'matplotlib.pyplot',\n]\nfor mod_name in MOCK_MODULES:\n    sys.modules[mod_name] = MagicMock()\n\n################################################################\n# General\n################################################################\n\nproject = 'Layered'\ncopyright = '2015, Danijar Hafner'\nauthor = 'Danijar Hafner'\nversion = '0.1'\nrelease = '0.1.4'\nsource_suffix = '.rst'\nmaster_doc = 'index'\ntemplates_path = ['_templates']\nexclude_patterns = ['_build']\npygments_style = 'sphinx'\nadd_module_names = False\ntodo_include_todos = False\nlanguage = None\nhtmlhelp_basename = 'Layereddoc'\n\n################################################################\n# HTML\n################################################################\n\nhtml_domain_indices = False\nhtml_use_index = False\nhtml_show_sphinx = False\nhtml_show_copyright = False\n\n################################################################\n# Autodoc\n################################################################\n\nautoclass_content = 'class'\nautodoc_member_order = 'bysource'\nautodoc_default_flags = [\n    'members',\n    'undoc-members',\n    'inherited-members',\n    'show-inheritance',\n]\nautodoc_mock_imports = MOCK_MODULES\n\n\ndef autodoc_skip_member(app, what, name, obj, skip, options):\n    keep = ['call', 'iter', 'getitem', 'setitem']\n    if name.strip('_') in keep:\n        return False\n    return skip\n\n\ndef setup(app):\n    app.connect(\"autodoc-skip-member\", autodoc_skip_member)\n"
  },
  {
    "path": "doc/index.rst",
    "content": "Layered Documentation\n=====================\n\n.. toctree::\n\n   layered.activation\n   layered.cost\n   layered.dataset\n   layered.evaluation\n   layered.example\n   layered.gradient\n   layered.network\n   layered.optimization\n   layered.plot\n   layered.problem\n   layered.trainer\n   layered.utility\n"
  },
  {
    "path": "doc/layered.activation.rst",
    "content": "layered.activation module\n=========================\n\n.. automodule:: layered.activation\n    :members:\n    :undoc-members:\n    :show-inheritance:\n"
  },
  {
    "path": "doc/layered.cost.rst",
    "content": "layered.cost module\n===================\n\n.. automodule:: layered.cost\n    :members:\n    :undoc-members:\n    :show-inheritance:\n"
  },
  {
    "path": "doc/layered.dataset.rst",
    "content": "layered.dataset module\n======================\n\n.. automodule:: layered.dataset\n    :members:\n    :undoc-members:\n    :show-inheritance:\n"
  },
  {
    "path": "doc/layered.evaluation.rst",
    "content": "layered.evaluation module\n=========================\n\n.. automodule:: layered.evaluation\n    :members:\n    :undoc-members:\n    :show-inheritance:\n"
  },
  {
    "path": "doc/layered.example.rst",
    "content": "layered.example module\n======================\n\n.. automodule:: layered.example\n    :members:\n    :undoc-members:\n    :show-inheritance:\n"
  },
  {
    "path": "doc/layered.gradient.rst",
    "content": "layered.gradient module\n=======================\n\n.. automodule:: layered.gradient\n    :members:\n    :undoc-members:\n    :show-inheritance:\n"
  },
  {
    "path": "doc/layered.network.rst",
    "content": "layered.network module\n======================\n\n.. automodule:: layered.network\n    :members:\n    :undoc-members:\n    :show-inheritance:\n"
  },
  {
    "path": "doc/layered.optimization.rst",
    "content": "layered.optimization module\n===========================\n\n.. automodule:: layered.optimization\n    :members:\n    :undoc-members:\n    :show-inheritance:\n"
  },
  {
    "path": "doc/layered.plot.rst",
    "content": "layered.plot module\n===================\n\n.. automodule:: layered.plot\n    :members:\n    :undoc-members:\n    :show-inheritance:\n"
  },
  {
    "path": "doc/layered.problem.rst",
    "content": "layered.problem module\n======================\n\n.. automodule:: layered.problem\n    :members:\n    :undoc-members:\n    :show-inheritance:\n"
  },
  {
    "path": "doc/layered.trainer.rst",
    "content": "layered.trainer module\n======================\n\n.. automodule:: layered.trainer\n    :members:\n    :undoc-members:\n    :show-inheritance:\n"
  },
  {
    "path": "doc/layered.utility.rst",
    "content": "layered.utility module\n======================\n\n.. automodule:: layered.utility\n    :members:\n    :undoc-members:\n    :show-inheritance:\n"
  },
  {
    "path": "layered/__init__.py",
    "content": ""
  },
  {
    "path": "layered/__main__.py",
    "content": "import os\nimport argparse\nfrom layered.problem import Problem\nfrom layered.trainer import Trainer\n\n\ndef main():\n    parser = argparse.ArgumentParser('layered')\n    parser.add_argument(\n        'problem',\n        help='path to the YAML problem definition')\n    parser.add_argument(\n        '-v', '--visual', action='store_true',\n        help='show a diagram of training costs')\n    parser.add_argument(\n        '-l', '--load', default=None,\n        help='path to load the weights from at startup')\n    parser.add_argument(\n        '-s', '--save', default=None,\n        help='path to dump the learned weights at each evaluation')\n    parser.add_argument(\n        '-c', '--check', action='store_true',\n        help='whether to activate gradient checking')\n    args = parser.parse_args()\n\n    print('Problem', os.path.split(args.problem)[1])\n    problem = Problem(args.problem)\n    trainer = Trainer(\n        problem, args.load, args.save, args.visual, args.check)\n    trainer()\n\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "layered/activation.py",
    "content": "import numpy as np\n\n\nclass Activation:\n\n    def __call__(self, incoming):\n        raise NotImplementedError\n\n    def delta(self, incoming, outgoing, above):\n        \"\"\"\n        Compute the derivative of the cost with respect to the input of this\n        activation function. Outgoing is what this function returned in the\n        forward pass and above is the derivative of the cost with respect to\n        the outgoing activation.\n        \"\"\"\n        raise NotImplementedError\n\n\nclass Identity(Activation):\n\n    def __call__(self, incoming):\n        return incoming\n\n    def delta(self, incoming, outgoing, above):\n        delta = np.ones(incoming.shape).astype(float)\n        return delta * above\n\n\nclass Sigmoid(Activation):\n\n    def __call__(self, incoming):\n        return 1 / (1 + np.exp(-incoming))\n\n    def delta(self, incoming, outgoing, above):\n        delta = outgoing * (1 - outgoing)\n        return delta * above\n\n\nclass Relu(Activation):\n\n    def __call__(self, incoming):\n        return np.maximum(incoming, 0)\n\n    def delta(self, incoming, outgoing, above):\n        delta = np.greater(incoming, 0).astype(float)\n        return delta * above\n\n\nclass Softmax(Activation):\n\n    def __call__(self, incoming):\n        # The constant doesn't change the expression but prevents overflows.\n        constant = np.max(incoming)\n        exps = np.exp(incoming - constant)\n        return exps / exps.sum()\n\n    def delta(self, incoming, outgoing, above):\n        delta = outgoing * above\n        sum_ = delta.sum(axis=delta.ndim - 1, keepdims=True)\n        delta -= outgoing * sum_\n        return delta\n\n\nclass SparseField(Activation):\n\n    def __init__(self, inhibition=0.05, leaking=0.0):\n        self.inhibition = inhibition\n        self.leaking = leaking\n\n    def __call__(self, incoming):\n        count = len(incoming)\n        length = int(np.sqrt(count))\n        assert length ** 2 == count, 
'layer size must be a square'\n        field = incoming.copy().reshape((length, length))\n        radius = int(np.sqrt(self.inhibition * count)) // 2\n        assert radius, 'no inhibition due to small factor'\n        outgoing = np.zeros(field.shape)\n        while True:\n            x, y = np.unravel_index(field.argmax(), field.shape)\n            if field[x, y] <= 0:\n                break\n            outgoing[x, y] = 1\n            surrounding = np.s_[\n                max(x - radius, 0):min(x + radius + 1, length),\n                max(y - radius, 0):min(y + radius + 1, length)]\n            field[surrounding] = 0\n            assert field[x, y] == 0\n        outgoing = outgoing.reshape(count)\n        outgoing = np.maximum(outgoing, self.leaking * incoming)\n        return outgoing\n\n    def delta(self, incoming, outgoing, above):\n        delta = np.greater(outgoing, 0).astype(float)\n        return delta * above\n\n\nclass SparseRange(Activation):\n    \"\"\"\n    E%-Max Winner-Take-All.\n\n    Binary activation. First, the activation function is applied. Then all\n    neurons within the specified range below the strongest neuron are set to\n    one. All others are set to zero. The gradient is the one of the activation\n    function for active neurons and zero otherwise.\n\n    See: A Second Function of Gamma Frequency Oscillations: An E%-Max\n    Winner-Take-All Mechanism Selects Which Cells Fire. 
(2009)\n    \"\"\"\n\n    def __init__(self, range_=0.3, function=Sigmoid()):\n        assert 0 < range_ < 1\n        self._range = range_\n        self._function = function\n\n    def __call__(self, incoming):\n        incoming = self._function(incoming)\n        threshold = self._threshold(incoming)\n        active = (incoming >= threshold)\n        outgoing = np.zeros(incoming.shape)\n        outgoing[active] = 1\n        # width = active.sum() * 80 / 1000\n        # print('|', '#' * width, ' ' * (80 - width), '|')\n        return outgoing\n\n    def delta(self, incoming, outgoing, above):\n        # return self._function.delta(incoming, outgoing, outgoing * above)\n        return outgoing * self._function.delta(incoming, outgoing, above)\n\n    def _threshold(self, incoming):\n        min_, max_ = incoming.min(), incoming.max()\n        threshold = min_ + (max_ - min_) * (1 - self._range)\n        return threshold\n"
  },
  {
    "path": "layered/cost.py",
    "content": "import numpy as np\n\n\nclass Cost:\n\n    def __call__(self, prediction, target):\n        raise NotImplementedError\n\n    def delta(self, prediction, target):\n        raise NotImplementedError\n\n\nclass SquaredError(Cost):\n    \"\"\"\n    Fast and simple cost function.\n    \"\"\"\n\n    def __call__(self, prediction, target):\n        return (prediction - target) ** 2 / 2\n\n    def delta(self, prediction, target):\n        return prediction - target\n\n\nclass CrossEntropy(Cost):\n    \"\"\"\n    Logistic cost function used for classification tasks. Learns faster in the\n    beginning than SquaredError because large errors are penalized\n    exponentially. This makes sense in classification since only the best class\n    will be the predicted one.\n    \"\"\"\n\n    def __init__(self, epsilon=1e-11):\n        self.epsilon = epsilon\n\n    def __call__(self, prediction, target):\n        clipped = np.clip(prediction, self.epsilon, 1 - self.epsilon)\n        cost = target * np.log(clipped) + (1 - target) * np.log(1 - clipped)\n        return -cost\n\n    def delta(self, prediction, target):\n        denominator = np.maximum(prediction - prediction ** 2, self.epsilon)\n        delta = (prediction - target) / denominator\n        assert delta.shape == target.shape == prediction.shape\n        return delta\n"
  },
  {
    "path": "layered/dataset.py",
    "content": "import array\nimport os\nimport shutil\nimport struct\nimport gzip\nfrom urllib.request import urlopen\nimport numpy as np\nfrom layered.example import Example\nfrom layered.utility import ensure_folder\n\n\nclass Dataset:\n\n    urls = []\n    cache = True\n\n    def __init__(self):\n        cache = type(self).cache\n        if cache and self._is_cached():\n            print('Load cached dataset')\n            self.load()\n        else:\n            filenames = [self.download(x) for x in type(self).urls]\n            self.training, self.testing = self.parse(*filenames)\n            if cache:\n                self.dump()\n\n    @classmethod\n    def folder(cls):\n        name = cls.__name__.lower()\n        home = os.path.expanduser('~')\n        folder = os.path.join(home, '.layered/dataset', name)\n        ensure_folder(folder)\n        return folder\n\n    def parse(self):\n        \"\"\"\n        Subclass responsibility. The filenames of downloaded files will be\n        passed as individual parameters to this function. Therefore, it must\n        accept as many parameters as provided class-site urls. 
Should return a\n        tuple of training examples and testing examples.\n        \"\"\"\n        raise NotImplementedError\n\n    def dump(self):\n        np.save(self._training_path(), self.training)\n        np.save(self._testing_path(), self.testing)\n\n    def load(self):\n        self.training = np.load(self._training_path())\n        self.testing = np.load(self._testing_path())\n\n    def download(self, url):\n        _, filename = os.path.split(url)\n        filename = os.path.join(self.folder(), filename)\n        print('Download', filename)\n        with urlopen(url) as response, open(filename, 'wb') as file_:\n            shutil.copyfileobj(response, file_)\n        return filename\n\n    @staticmethod\n    def split(examples, ratio=0.8):\n        \"\"\"\n        Utility function that can be used within the parse() implementation of\n        subclasses to split a list of examples into two lists for training and\n        testing.\n        \"\"\"\n        split = int(ratio * len(examples))\n        return examples[:split], examples[split:]\n\n    def _is_cached(self):\n        if not os.path.exists(self._training_path()):\n            return False\n        if not os.path.exists(self._testing_path()):\n            return False\n        return True\n\n    def _training_path(self):\n        return os.path.join(self.folder(), 'training.npy')\n\n    def _testing_path(self):\n        return os.path.join(self.folder(), 'testing.npy')\n\n\nclass Test(Dataset):\n\n    cache = False\n\n    def __init__(self, amount=10):\n        self.amount = amount\n        super().__init__()\n\n    def parse(self):\n        examples = [Example([1, 2, 3], [1, 2, 3]) for _ in range(self.amount)]\n        return self.split(examples)\n\n\nclass Regression(Dataset):\n    \"\"\"\n    Synthetically generated dataset for regression. The task is to predict the\n    sum and product of all the input values. 
All values are normalized between\n    zero and one.\n    \"\"\"\n\n    cache = False\n\n    def __init__(self, amount=10000, inputs=10):\n        self.amount = amount\n        self.inputs = inputs\n        super().__init__()\n\n    def parse(self):\n        data = np.random.rand(self.amount, self.inputs)\n        products = np.prod(data, axis=1)\n        products = products / np.max(products)\n        sums = np.sum(data, axis=1)\n        sums = sums / np.max(sums)\n        targets = np.column_stack([sums, products])\n        examples = [Example(x, y) for x, y in zip(data, targets)]\n        return self.split(examples)\n\n\nclass Modulo(Dataset):\n    \"\"\"\n    Synthetically generated classification dataset. The task is to predict the\n    modulo classes of random integers encoded as bit arrays of length 32.\n    \"\"\"\n\n    cache = False\n\n    def __init__(self, amount=60000, inputs=32, classes=7):\n        self.amount = amount\n        self.inputs = inputs\n        self.classes = classes\n        super().__init__()\n\n    def parse(self):\n        # Cover the full range of integers that fit into the bit arrays.\n        data = np.random.randint(0, 2 ** self.inputs, self.amount)\n        mods = np.mod(data, self.classes)\n        targets = np.zeros((self.amount, self.classes))\n        for index, mod in enumerate(mods):\n            targets[index][mod] = 1\n        data = (((data[:, None] & (1 << np.arange(self.inputs)))) > 0)\n        examples = [Example(x, y) for x, y in zip(data, targets)]\n        return self.split(examples)\n\n\nclass Mnist(Dataset):\n    \"\"\"\n    The MNIST database of handwritten digits, available from this page, has a\n    training set of 60,000 examples, and a test set of 10,000 examples. It is a\n    subset of a larger set available from NIST. The digits have been\n    size-normalized and centered in a fixed-size image. 
It is a good database\n    for people who want to try learning techniques and pattern recognition\n    methods on real-world data while spending minimal efforts on preprocessing\n    and formatting. (from http://yann.lecun.com/exdb/mnist/)\n    \"\"\"\n\n    urls = [\n        'http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz',\n        'http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz',\n        'http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz',\n        'http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz',\n    ]\n\n    def parse(self, train_x, train_y, test_x, test_y):\n        # pylint: disable=arguments-differ\n        training = list(self.read(train_x, train_y))\n        testing = list(self.read(test_x, test_y))\n        return training, testing\n\n    @staticmethod\n    def read(data, labels):\n        images = gzip.open(data, 'rb')\n        _, size, rows, cols = struct.unpack('>IIII', images.read(16))\n        image_bin = array.array('B', images.read())\n        images.close()\n\n        labels = gzip.open(labels, 'rb')\n        _, size2 = struct.unpack('>II', labels.read(8))\n        assert size == size2\n        label_bin = array.array('B', labels.read())\n        labels.close()\n\n        for i in range(size):\n            data = image_bin[i * rows * cols:(i + 1) * rows * cols]\n            data = np.array(data).reshape(rows * cols) / 255\n            target = np.zeros(10)\n            target[label_bin[i]] = 1\n            yield Example(data, target)\n"
  },
  {
    "path": "layered/evaluation.py",
    "content": "import numpy as np\n\n\ndef compute_costs(network, weights, cost, examples):\n    prediction = [network.feed(weights, x.data) for x in examples]\n    costs = [cost(x, y.target).mean() for x, y in zip(prediction, examples)]\n    return costs\n\n\ndef compute_error(network, weights, examples):\n    prediction = [network.feed(weights, x.data) for x in examples]\n    error = sum(bool(np.argmax(x) != np.argmax(y.target)) for x, y in\n                zip(prediction, examples)) / len(examples)\n    return error\n"
  },
  {
    "path": "layered/example.py",
    "content": "import numpy as np\n\n\nclass Example:\n    \"\"\"\n    Immutable class representing one example in a dataset.\n    \"\"\"\n\n    __slots__ = ('_data', '_target')\n\n    def __init__(self, data, target):\n        self._data = np.array(data, dtype=float)\n        self._target = np.array(target, dtype=float)\n\n    @property\n    def data(self):\n        return self._data\n\n    @property\n    def target(self):\n        return self._target\n\n    def __getstate__(self):\n        return {'data': self.data, 'target': self.target}\n\n    def __setstate__(self, state):\n        self._data = state['data']\n        self._target = state['target']\n\n    def __repr__(self):\n        data = ' '.join(str(round(x, 2)) for x in self.data)\n        target = ' '.join(str(round(x, 2)) for x in self.target)\n        return '({})->({})'.format(data, target)\n"
  },
  {
    "path": "layered/gradient.py",
    "content": "import math\nimport functools\nimport multiprocessing\nimport numpy as np\nfrom layered.network import Matrices\nfrom layered.utility import batched\n\n\nclass Gradient:\n\n    def __init__(self, network, cost):\n        self.network = network\n        self.cost = cost\n\n    def __call__(self, weights, example):\n        raise NotImplementedError\n\n\nclass Backprop(Gradient):\n    \"\"\"\n    Use the backpropagation algorithm to efficiently determine the gradient of\n    the cost function with respect to each individual weight.\n    \"\"\"\n\n    def __call__(self, weights, example):\n        prediction = self.network.feed(weights, example.data)\n        delta_output = self._delta_output(prediction, example.target)\n        delta_layers = self._delta_layers(weights, delta_output)\n        delta_weights = self._delta_weights(delta_layers)\n        return delta_weights\n\n    def _delta_output(self, prediction, target):\n        assert len(target) == self.network.layers[-1].size\n        # The derivative with respect to the output layer is computed as the\n        # product of error derivative and local derivative at the layer.\n        delta_cost = self.cost.delta(prediction, target)\n        delta_output = self.network.layers[-1].delta(delta_cost)\n        assert len(delta_cost) == len(delta_output) == len(target)\n        return delta_output\n\n    def _delta_layers(self, weights, delta_output):\n        # Propagate backwards through the hidden layers but not the input\n        # layer. 
The current weight matrix is the one to the right of the\n        # current layer.\n        gradient = [delta_output]\n        hidden = list(zip(weights[1:], self.network.layers[1:-1]))\n        assert all(x.shape[0] - 1 == len(y) for x, y in hidden)\n        for weight, layer in reversed(hidden):\n            delta = self._delta_layer(layer, weight, gradient[-1])\n            gradient.append(delta)\n        return reversed(gradient)\n\n    def _delta_layer(self, layer, weight, above):\n        # The gradient at a layer is computed as the derivative of both the\n        # local activation and the weighted sum of the derivatives in the\n        # deeper layer.\n        backward = self.network.backward(weight, above)\n        delta = layer.delta(backward)\n        assert len(layer) == len(backward) == len(delta)\n        return delta\n\n    def _delta_weights(self, delta_layers):\n        # The gradient with respect to the weights is computed as the gradient\n        # at the target neuron multiplied by the activation of the source\n        # neuron.\n        gradient = Matrices(self.network.shapes)\n        prev_and_delta = zip(self.network.layers[:-1], delta_layers)\n        for index, (previous, delta) in enumerate(prev_and_delta):\n            # We want to tweak the bias weights so we need them in the\n            # gradient.\n            activations = np.insert(previous.outgoing, 0, 1)\n            assert activations[0] == 1\n            gradient[index] = np.outer(activations, delta)\n        return gradient\n\n\nclass NumericalGradient(Gradient):\n    \"\"\"\n    Approximate the gradient for each weight individually by sampling the error\n    function slightly above and below the current value of the weight.\n    \"\"\"\n\n    def __init__(self, network, cost, distance=1e-5):\n        super().__init__(network, cost)\n        self.distance = distance\n\n    def __call__(self, weights, example):\n        \"\"\"\n        Modify each weight individually in both 
directions to calculate a\n        numeric gradient of the weights.\n        \"\"\"\n        # We need a copy of the weights that we can modify to evaluate the cost\n        # function on.\n        modified = Matrices(weights.shapes, weights.flat.copy())\n        gradient = Matrices(weights.shapes)\n        for i, connection in enumerate(weights):\n            for j, original in np.ndenumerate(connection):\n                # Sample above and below and compute costs.\n                modified[i][j] = original + self.distance\n                above = self._evaluate(modified, example)\n                modified[i][j] = original - self.distance\n                below = self._evaluate(modified, example)\n                # Restore the original value so we can reuse the weight matrix\n                # for the next iteration.\n                modified[i][j] = original\n                # Compute the numeric gradient.\n                sample = (above - below) / (2 * self.distance)\n                gradient[i][j] = sample\n        return gradient\n\n    def _evaluate(self, weights, example):\n        prediction = self.network.feed(weights, example.data)\n        cost = self.cost(prediction, example.target)\n        assert cost.shape == prediction.shape\n        return cost.sum()\n\n\nclass CheckedBackprop(Gradient):\n    \"\"\"\n    Compute the gradient both analytically through backpropagation and\n    numerically to validate the backpropagation implementation and derivatives\n    of activation functions and cost functions. 
This is inherently slow, so\n    it is recommended to validate derivatives on small networks only.\n    \"\"\"\n\n    def __init__(self, network, cost, distance=1e-5, tolerance=1e-8):\n        self.tolerance = tolerance\n        super().__init__(network, cost)\n        self.analytic = Backprop(network, cost)\n        self.numeric = NumericalGradient(network, cost, distance)\n\n    def __call__(self, weights, example):\n        analytic = self.analytic(weights, example)\n        numeric = self.numeric(weights, example)\n        distances = np.absolute(analytic.flat - numeric.flat)\n        worst = distances.max()\n        if worst > self.tolerance:\n            print('Gradient differs by up to {:.2e}'.format(worst))\n        else:\n            print('Gradient looks good')\n        return analytic\n\n\nclass BatchBackprop:\n    \"\"\"\n    Calculate the average gradient over a batch of examples.\n    \"\"\"\n\n    def __init__(self, network, cost):\n        self.backprop = Backprop(network, cost)\n\n    def __call__(self, weights, examples):\n        gradient = Matrices(weights.shapes)\n        for example in examples:\n            gradient += self.backprop(weights, example)\n        return gradient / len(examples)\n\n\nclass ParallelBackprop:\n    \"\"\"\n    Alternative to BatchBackprop that yields the same results but utilizes\n    multiprocessing to make use of more than one processor core.\n    \"\"\"\n\n    def __init__(self, network, cost, workers=4):\n        self.backprop = BatchBackprop(network, cost)\n        self.workers = workers\n        self.pool = multiprocessing.Pool(self.workers)\n\n    def __call__(self, weights, examples):\n        batch_size = int(math.ceil(len(examples) / self.workers))\n        batches = list(batched(examples, batch_size))\n        sizes = [len(x) / batch_size for x in batches]\n        sizes = [x / sum(sizes) for x in sizes]\n        assert len(batches) <= self.workers\n        # Compare with a tolerance since the normalized weights may not sum\n        # to exactly one in floating point.\n        assert abs(sum(sizes) - 1) < 1e-6\n        compute = 
functools.partial(self.backprop, weights)\n        gradients = self.pool.map(compute, batches)\n        return sum(x * y for x, y in zip(gradients, sizes))\n"
  },
  {
    "path": "layered/network.py",
    "content": "import operator\nimport numpy as np\n\n\nclass Layer:\n\n    def __init__(self, size, activation):\n        assert size and isinstance(size, int)\n        self.size = size\n        self.activation = activation()\n        self.incoming = np.zeros(size)\n        self.outgoing = np.zeros(size)\n        assert len(self.incoming) == len(self.outgoing) == self.size\n\n    def __len__(self):\n        assert len(self.incoming) == len(self.outgoing)\n        return len(self.incoming)\n\n    def __repr__(self):\n        return repr(self.outgoing)\n\n    def __str__(self):\n        table = zip(self.incoming, self.outgoing)\n        rows = [' /'.join('{: >6.3f}'.format(x) for x in row) for row in table]\n        return '\\n'.join(rows)\n\n    def apply(self, incoming):\n        \"\"\"\n        Store the incoming activation, apply the activation function and store\n        the result as outgoing activation.\n        \"\"\"\n        assert len(incoming) == self.size\n        self.incoming = incoming\n        outgoing = self.activation(self.incoming)\n        assert len(outgoing) == self.size\n        self.outgoing = outgoing\n\n    def delta(self, above):\n        \"\"\"\n        The derivative of the activation function at the current state.\n        \"\"\"\n        return self.activation.delta(self.incoming, self.outgoing, above)\n\n\nclass Matrices:\n\n    def __init__(self, shapes, elements=None):\n        self.shapes = shapes\n        length = sum(x * y for x, y in shapes)\n        if elements is not None:\n            assert len(elements) == length\n            elements = elements.copy()\n        else:\n            elements = np.zeros(length)\n        self.flat = elements\n\n    def __len__(self):\n        return len(self.shapes)\n\n    def __getitem__(self, index):\n        if hasattr(index, '__len__'):\n            assert isinstance(index[0], int)\n            return self[index[0]][index[1:]]\n        if isinstance(index, slice):\n            return 
[self[i] for i in self._range_from_slice(index)]\n        slice_ = self._locate(index)\n        data = self.flat[slice_]\n        data = data.reshape(self.shapes[index])\n        return data\n\n    def __setitem__(self, index, data):\n        if hasattr(index, '__len__'):\n            assert isinstance(index[0], int)\n            self[index[0]][index[1:]] = data\n            return\n        if isinstance(index, slice):\n            for i in self._range_from_slice(index):\n                self[i] = data\n            return\n        slice_ = self._locate(index)\n        data = data.reshape(slice_.stop - slice_.start)\n        self.flat[slice_] = data\n\n    def __getattr__(self, name):\n        # Forward attribute lookups that fail to the underlying array.\n        flat = super().__getattribute__('flat')\n        return getattr(flat, name)\n\n    def __setattr__(self, name, value):\n        # Ensure that the size of the underlying array doesn't change. Skip\n        # the check during construction, before the flat attribute exists.\n        if name == 'flat' and 'flat' in self.__dict__:\n            assert value.shape == self.flat.shape\n        super().__setattr__(name, value)\n\n    def copy(self):\n        return Matrices(self.shapes, self.flat.copy())\n\n    def __add__(self, other):\n        return self._operation(other, lambda x, y: x + y)\n\n    def __sub__(self, other):\n        return self._operation(other, lambda x, y: x - y)\n\n    def __mul__(self, other):\n        return self._operation(other, lambda x, y: x * y)\n\n    def __truediv__(self, other):\n        return self._operation(other, lambda x, y: x / y)\n\n    __rmul__ = __mul__\n\n    __radd__ = __add__\n\n    def _operation(self, other, operation):\n        try:\n            other = other.flat\n        except AttributeError:\n            pass\n        return Matrices(self.shapes, operation(self.flat, other))\n\n    def _locate(self, index):\n        assert isinstance(index, int), (\n            'Only single elements can be indexed in the first dimension.')\n        if index < 0:\n            index = 
len(self.shapes) + index\n        if not 0 <= index < len(self.shapes):\n            raise IndexError\n        offset = sum(x * y for x, y in self.shapes[:index])\n        length = operator.mul(*self.shapes[index])\n        return slice(offset, offset + length)\n\n    def _range_from_slice(self, slice_):\n        start = slice_.start if slice_.start else 0\n        stop = slice_.stop if slice_.stop else len(self.shapes)\n        step = slice_.step if slice_.step else 1\n        return range(start, stop, step)\n\n    def __str__(self):\n        return str(len(self.flat)) + str(self.flat)\n\n\nclass Network:\n\n    def __init__(self, layers):\n        self.layers = layers\n        self.sizes = tuple(layer.size for layer in self.layers)\n        # Weight matrices have the dimensions of the two layers around them.\n        # Also, there is an additional bias input to each weight matrix.\n        self.shapes = zip(self.sizes[:-1], self.sizes[1:])\n        self.shapes = [(x + 1, y) for x, y in self.shapes]\n        # Weight matrices are in between the layers.\n        assert len(self.shapes) == len(self.layers) - 1\n\n    def feed(self, weights, data):\n        \"\"\"\n        Evaluate the network with alternative weights on the input data and\n        return the output activation.\n        \"\"\"\n        assert len(data) == self.layers[0].size\n        self.layers[0].apply(data)\n        # Propagate through the remaining layers.\n        connections = zip(self.layers[:-1], weights, self.layers[1:])\n        for previous, weight, current in connections:\n            incoming = self.forward(weight, previous.outgoing)\n            current.apply(incoming)\n        # Return the activations of the output layer.\n        return self.layers[-1].outgoing\n\n    @staticmethod\n    def forward(weight, activations):\n        # Add bias input of one.\n        activations = np.insert(activations, 0, 1)\n        assert activations[0] == 1\n        right = activations.dot(weight)\n        return right\n\n    @staticmethod\n    def backward(weight, activations):\n        left = activations.dot(weight.transpose())\n        # Don't expose the bias input of one.\n        left = left[1:]\n        return left\n"
  },
  {
    "path": "layered/optimization.py",
    "content": "class GradientDecent:\n    \"\"\"\n    Adapt the weights in the opposite direction of the gradient to reduce the\n    error.\n    \"\"\"\n\n    def __call__(self, weights, gradient, learning_rate=0.1):\n        return weights - learning_rate * gradient\n\n\nclass Momentum:\n    \"\"\"\n    Slow down changes of direction in the gradient by aggregating previous\n    values of the gradient and multiplying them in.\n    \"\"\"\n\n    def __init__(self):\n        self.previous = None\n\n    def __call__(self, gradient, rate=0.9):\n        gradient = gradient.copy()\n        if self.previous is None:\n            self.previous = gradient.copy()\n        else:\n            assert self.previous.shape == gradient.shape\n            gradient += rate * self.previous\n            self.previous = gradient.copy()\n        return gradient\n\n\nclass WeightDecay:\n    \"\"\"\n    Slowly moves each weight closer to zero for regularization. This can help\n    the model to find simpler solutions.\n    \"\"\"\n\n    def __call__(self, weights, rate=1e-4):\n        return (1 - rate) * weights\n\n\nclass WeightTying:\n    \"\"\"\n    Constraint groups of slices of the gradient to have the same value by\n    averaging them. Should be applied to the initial weights and each gradient.\n    \"\"\"\n\n    def __init__(self, *groups):\n        for group in groups:\n            assert group and hasattr(group, '__len__')\n            assert all([isinstance(x[0], int) for x in group])\n            assert all([isinstance(y, (slice, int)) for x in group for y in x])\n        self.groups = groups\n\n    def __call__(self, matrices):\n        matrices = matrices.copy()\n        for group in self.groups:\n            slices = [matrices[slice_] for slice_ in group]\n            assert all([x.shape == slices[0].shape for x in slices]), (\n                'All slices within a group must have the same shape. 
'\n                'Shapes are ' + ', '.join(str(x.shape) for x in slices) + '.')\n            average = sum(slices) / len(slices)\n            assert average.shape == slices[0].shape\n            for slice_ in group:\n                matrices[slice_] = average\n        return matrices\n"
  },
  {
    "path": "layered/plot.py",
    "content": "# pylint: disable=wrong-import-position\nimport collections\nimport time\nimport warnings\nimport inspect\nimport threading\nimport matplotlib\n\n# Don't call the code if Sphinx inspects the file mocking external imports.\nif inspect.ismodule(matplotlib):  # noqa\n    # On Mac force backend that works with threading.\n    if matplotlib.get_backend() == 'MacOSX':\n        matplotlib.use('TkAgg')\n    # Hide matplotlib deprecation message.\n    warnings.filterwarnings('ignore', category=matplotlib.cbook.mplDeprecation)\n    # Ensure available interactive backend.\n    if matplotlib.get_backend() not in matplotlib.rcsetup.interactive_bk:\n        print('No visual backend available. Maybe you are inside a virtualenv '\n              'that was created without --system-site-packages.')\n\nimport matplotlib.pyplot as plt\n\n\nclass Interface:\n\n    def __init__(self, title='', xlabel='', ylabel='', style=None):\n        self._style = style or {}\n        self._title = title\n        self._xlabel = xlabel\n        self._ylabel = ylabel\n        self.xdata = []\n        self.ydata = []\n        self.width = 0\n        self.height = 0\n\n    @property\n    def style(self):\n        return self._style\n\n    @property\n    def title(self):\n        return self._title\n\n    @property\n    def xlabel(self):\n        return self._xlabel\n\n    @property\n    def ylabel(self):\n        return self._ylabel\n\n\nclass State:\n\n    def __init__(self):\n        self.running = False\n\n\nclass Window:\n\n    def __init__(self, refresh=0.5):\n        self.refresh = refresh\n        self.thread = None\n        self.state = State()\n        self.figure = plt.figure()\n        self.interfaces = []\n        plt.ion()\n        plt.show()\n\n    def register(self, position, interface):\n        axis = self.figure.add_subplot(\n            position, title=interface.title,\n            xlabel=interface.xlabel, ylabel=interface.ylabel)\n        
axis.get_xaxis().set_ticks([])\n        line, = axis.plot(interface.xdata, interface.ydata, **interface.style)\n        self.interfaces.append((axis, line, interface))\n\n    def start(self, work):\n        \"\"\"\n        Hand the main thread to the window and continue work in the provided\n        function. A state is passed as the first argument that contains a\n        `running` flag. The function is expected to exit if the flag becomes\n        false. The flag can also be set to false to stop the window event loop\n        and continue in the main thread after the `start()` call.\n        \"\"\"\n        assert threading.current_thread() == threading.main_thread()\n        assert not self.state.running\n        self.state.running = True\n        self.thread = threading.Thread(target=work, args=(self.state,))\n        self.thread.start()\n        while self.state.running:\n            try:\n                before = time.time()\n                self.update()\n                duration = time.time() - before\n                plt.pause(max(0.001, self.refresh - duration))\n            except KeyboardInterrupt:\n                self.state.running = False\n                self.thread.join()\n                return\n\n    def stop(self):\n        \"\"\"\n        Close the window and stop the worker thread. The main thread will\n        resume with the next command after the `start()` call.\n        \"\"\"\n        assert threading.current_thread() == self.thread\n        assert self.state.running\n        self.state.running = False\n\n    def update(self):\n        \"\"\"\n        Redraw the figure to show changed data. 
This is called automatically\n        while `start()` is running.\n        \"\"\"\n        assert threading.current_thread() == threading.main_thread()\n        for axis, line, interface in self.interfaces:\n            line.set_xdata(interface.xdata)\n            line.set_ydata(interface.ydata)\n            axis.set_xlim(0, interface.width or 1, emit=False)\n            axis.set_ylim(0, interface.height or 1, emit=False)\n        self.figure.canvas.draw()\n\n\nclass Plot(Interface):\n\n    def __init__(self, title, xlabel, ylabel, style=None, fixed=None):\n        # pylint: disable=too-many-arguments, redefined-variable-type\n        super().__init__(title, xlabel, ylabel, style or {})\n        self.max_ = 0\n        if not fixed:\n            self.xdata = []\n            self.ydata = []\n        else:\n            self.xdata = list(range(fixed))\n            self.ydata = collections.deque([None] * fixed, maxlen=fixed)\n            self.width = fixed\n\n    def __call__(self, values):\n        self.ydata += values\n        self.max_ = max(self.max_, *values)\n        self.height = 1.05 * self.max_\n        while len(self.xdata) < len(self.ydata):\n            self.xdata.append(len(self.xdata))\n        self.width = len(self.xdata) - 1\n        assert len(self.xdata) == len(self.ydata)\n"
  },
  {
    "path": "layered/problem.py",
    "content": "import os\nimport yaml\nimport layered.cost\nimport layered.dataset\nimport layered.activation\nfrom layered.network import Layer\n\n\nclass Problem:\n\n    def __init__(self, content=None):\n        \"\"\"\n        Construct a problem. If content is specified, try to load it as a YAML\n        path and otherwise treat it as an inline YAML string.\n        \"\"\"\n        if content and os.path.isfile(content):\n            with open(content) as file_:\n                self.parse(file_)\n        elif content:\n            self.parse(content)\n        self._validate()\n\n    def __str__(self):\n        keys = self.__dict__.keys() & self._defaults().keys()\n        return str({x: getattr(self, x) for x in keys})\n\n    def parse(self, definition):\n        definition = yaml.load(definition)\n        self._load_definition(definition)\n        self._load_symbols()\n        self._load_layers()\n        self._load_weight_tying()\n        assert not definition, (\n            'unknown properties {} in problem definition'\n            .format(', '.join(definition.keys())))\n\n    def _load_definition(self, definition):\n        # The empty dictionary causes defaults to be loaded even if the\n        # definition is None.\n        if not definition:\n            definition = {}\n        for name, default in self._defaults().items():\n            type_ = type(default)\n            self.__dict__[name] = type_(definition.pop(name, default))\n\n    def _load_symbols(self):\n        # pylint: disable=attribute-defined-outside-init\n        self.cost = self._find_symbol(layered.cost, self.cost)()\n        self.dataset = self._find_symbol(layered.dataset, self.dataset)()\n\n    def _load_layers(self):\n        for index, layer in enumerate(self.layers):\n            size, activation = layer.pop('size'), layer.pop('activation')\n            activation = self._find_symbol(layered.activation, activation)\n            self.layers[index] = Layer(size, activation)\n\n    
def _load_weight_tying(self):\n        # pylint: disable=attribute-defined-outside-init\n        self.weight_tying = [[y.split(',') for y in x]\n                             for x in self.weight_tying]\n        for i, group in enumerate(self.weight_tying):\n            for j, slices in enumerate(group):\n                for k, slice_ in enumerate(slices):\n                    slice_ = [int(s) if s else None for s in slice_.split(':')]\n                    self.weight_tying[i][j][k] = slice(*slice_)\n        for i, group in enumerate(self.weight_tying):\n            for j, slices in enumerate(group):\n                assert not slices[0].start and not slices[0].step, (\n                    'Ranges are not allowed in the first dimension.')\n                self.weight_tying[i][j][0] = slices[0].stop\n\n    def _find_symbol(self, module, name, fallback=None):\n        \"\"\"\n        Find the symbol of the specified name inside the module or raise an\n        exception.\n        \"\"\"\n        if not hasattr(module, name) and fallback:\n            return self._find_symbol(module, fallback, None)\n        return getattr(module, name)\n\n    def _validate(self):\n        num_input = len(self.dataset.training[0].data)\n        num_output = len(self.dataset.training[0].target)\n        if self.layers:\n            assert self.layers[0].size == num_input, (\n                'the size of the input layer must match the training data')\n            assert self.layers[-1].size == num_output, (\n                'the size of the output layer must match the training labels')\n\n    @staticmethod\n    def _defaults():\n        return {\n            'cost': 'SquaredError',\n            'dataset': 'Modulo',\n            'layers': [],\n            'epochs': 1,\n            'batch_size': 1,\n            'learning_rate': 0.1,\n            'momentum': 0.0,\n            'weight_scale': 0.1,\n            'weight_mean': 0.0,\n            'weight_decay': 0.0,\n            'weight_tying': 
[],\n            'evaluate_every': 1000,\n        }\n"
  },
  {
    "path": "layered/trainer.py",
    "content": "import functools\nimport numpy as np\nfrom layered.gradient import BatchBackprop, CheckedBackprop\nfrom layered.network import Network, Matrices\nfrom layered.optimization import (\n    GradientDecent, Momentum, WeightDecay, WeightTying)\nfrom layered.utility import repeated, batched\nfrom layered.evaluation import compute_costs, compute_error\n\n\nclass Trainer:\n    # pylint: disable=attribute-defined-outside-init, too-many-arguments\n\n    def __init__(self, problem, load=None, save=None,\n                 visual=False, check=False):\n        self.problem = problem\n        self.load = load\n        self.save = save\n        self.visual = visual\n        self.check = check\n        self._init_network()\n        self._init_training()\n        self._init_visualize()\n\n    def _init_network(self):\n        \"\"\"Define model and initialize weights.\"\"\"\n        self.network = Network(self.problem.layers)\n        self.weights = Matrices(self.network.shapes)\n        if self.load:\n            loaded = np.load(self.load)\n            assert loaded.shape == self.weights.shape, (\n                'weights to load must match problem definition')\n            self.weights.flat = loaded\n        else:\n            self.weights.flat = np.random.normal(\n                self.problem.weight_mean, self.problem.weight_scale,\n                len(self.weights.flat))\n\n    def _init_training(self):\n        # pylint: disable=redefined-variable-type\n        \"\"\"Classes needed during training.\"\"\"\n        if self.check:\n            self.backprop = CheckedBackprop(self.network, self.problem.cost)\n        else:\n            self.backprop = BatchBackprop(self.network, self.problem.cost)\n        self.momentum = Momentum()\n        self.decent = GradientDecent()\n        self.decay = WeightDecay()\n        self.tying = WeightTying(*self.problem.weight_tying)\n        self.weights = self.tying(self.weights)\n\n    def _init_visualize(self):\n        if not 
self.visual:\n            return\n        from layered.plot import Window, Plot\n        self.plot_training = Plot(\n            'Training', 'Examples', 'Cost', fixed=1000,\n            style={'linestyle': '', 'marker': '.'})\n        self.plot_testing = Plot('Testing', 'Time', 'Error')\n        self.window = Window()\n        self.window.register(211, self.plot_training)\n        self.window.register(212, self.plot_testing)\n\n    def __call__(self):\n        \"\"\"Train the model and visualize progress.\"\"\"\n        print('Start training')\n        repeats = repeated(self.problem.dataset.training, self.problem.epochs)\n        batches = batched(repeats, self.problem.batch_size)\n        if self.visual:\n            self.window.start(functools.partial(self._train_visual, batches))\n        else:\n            self._train(batches)\n\n    def _train(self, batches):\n        for index, batch in enumerate(batches):\n            try:\n                self._batch(index, batch)\n            except KeyboardInterrupt:\n                print('\\nAborted')\n                return\n        print('Done')\n\n    def _train_visual(self, batches, state):\n        for index, batch in enumerate(batches):\n            if not state.running:\n                print('\\nAborted')\n                return\n            self._batch(index, batch)\n        print('Done')\n        input('Press any key to close window')\n        state.running = False\n\n    def _batch(self, index, batch):\n        if self.check:\n            assert len(batch) == 1\n            gradient = self.backprop(self.weights, batch[0])\n        else:\n            gradient = self.backprop(self.weights, batch)\n        gradient = self.momentum(gradient, self.problem.momentum)\n        gradient = self.tying(gradient)\n        self.weights = self.decent(\n            self.weights, gradient, self.problem.learning_rate)\n        self.weights = self.decay(self.weights, self.problem.weight_decay)\n        self._visualize(batch)\n 
       self._evaluate(index)\n\n    def _visualize(self, batch):\n        if not self.visual:\n            return\n        costs = compute_costs(\n            self.network, self.weights, self.problem.cost, batch)\n        self.plot_training(costs)\n\n    def _evaluate(self, index):\n        if not self._every(self.problem.evaluate_every,\n                           self.problem.batch_size, index):\n            return\n        if self.save:\n            np.save(self.save, self.weights)\n        error = compute_error(\n            self.network, self.weights, self.problem.dataset.testing)\n        print('Batch {} test error {:.2f}%'.format(index, 100 * error))\n        if self.visual:\n            self.plot_testing([error])\n\n    @staticmethod\n    def _every(times, step_size, index):\n        \"\"\"\n        Given a loop over batches of examples and an operation that should\n        run every few examples, determine whether the operation is due at\n        the current batch index.\n        \"\"\"\n        current = index * step_size\n        step = current // times * times\n        reached = current >= step\n        overshot = current >= step + step_size\n        return current and reached and not overshot\n"
  },
  {
    "path": "layered/utility.py",
    "content": "import os\nimport errno\nimport functools\nimport itertools\n\n\ndef repeated(iterable, times):\n    for _ in range(times):\n        yield from iterable\n\n\ndef batched(iterable, size):\n    batch = []\n    for element in iterable:\n        batch.append(element)\n        if len(batch) == size:\n            yield batch\n            batch = []\n    if batch:\n        yield batch\n\n\ndef averaged(callable_, batch):\n    overall = None\n    for element in batch:\n        current = callable_(element)\n        overall = overall + current if overall else current\n    return overall / len(batch)\n\n\ndef listify(fn=None, wrapper=list):\n    \"\"\"\n    From http://stackoverflow.com/a/12377059/1079110\n    \"\"\"\n    def listify_return(fn):\n        @functools.wraps(fn)\n        def listify_helper(*args, **kw):\n            return wrapper(fn(*args, **kw))\n        return listify_helper\n\n    if fn is None:\n        return listify_return\n    return listify_return(fn)\n\n\ndef ensure_folder(path):\n    try:\n        os.makedirs(path)\n    except OSError as e:\n        if e.errno == errno.EEXIST:\n            return\n        raise\n\n\ndef hstack_lines(blocks, sep=' '):\n    blocks = [x.split('\\n') for x in blocks]\n    height = max(len(block) for block in blocks)\n    widths = [max(len(line) for line in block) for block in blocks]\n    output = ''\n    for y in range(height):\n        for x, w in enumerate(widths):\n            cell = blocks[x][y] if y < len(blocks[x]) else ''\n            output += cell.rjust(w, ' ') + sep\n        output += '\\n'\n    return output\n\n\ndef pairwise(iterable):\n    a, b = itertools.tee(iterable)\n    next(b, None)\n    return zip(a, b)\n"
  },
  {
    "path": "problem/mnist-relu-batch.yaml",
    "content": "# 2.12%\ndataset: Mnist\ncost: CrossEntropy\nlayers:\n- activation: Identity\n  size: 784\n- activation: Relu\n  size: 700\n- activation: Relu\n  size: 700\n- activation: Relu\n  size: 400\n- activation: Softmax\n  size: 10\nepochs: 5\nbatch_size: 32\nlearning_rate: 0.01\nmomentum: 0.9\nweight_scale: 0.01\nweight_decay: 0\nevaluate_every: 5000\n"
  },
  {
    "path": "problem/mnist-relu-online.yaml",
    "content": "# 2.59%\ndataset: Mnist\ncost: CrossEntropy\nlayers:\n- activation: Identity\n  size: 784\n- activation: Relu\n  size: 700\n- activation: Relu\n  size: 400\n- activation: Softmax\n  size: 10\nepochs: 2\nlearning_rate: 0.001\nmomentum: 0\nweight_scale: 0.01\nweight_decay: 0\nevaluate_every: 5000\n"
  },
  {
    "path": "problem/modulo.yaml",
    "content": "dataset: Modulo\ncost: CrossEntropy\nlayers:\n- activation: Identity\n  size: 32\n- activation: Max\n  size: 64\n- activation: Max\n  size: 64\n- activation: Softmax\n  size: 7\nepochs: 5\nlearning_rate: 0.01\nweight_scale: 0.1\nevaluate_every: 5000\n"
  },
  {
    "path": "problem/sparse-field-batch.yaml",
    "content": "# 8.57%\ndataset: Mnist\ncost: CrossEntropy\nlayers:\n- activation: Identity\n  size: 784\n- activation: SparseField\n  size: 625\n- activation: SparseField\n  size: 625\n- activation: Softmax\n  size: 10\nepochs: 5\nlearning_rate: 0.1\nmomentum: 0.9\nbatch_size: 100\nweight_scale: 0.001\nweight_mean: 0.001\nweight_decay: 0\nevaluate_every: 5000\n"
  },
  {
    "path": "problem/sparse-field-online.yaml",
    "content": "# 6.42%\ndataset: Mnist\ncost: CrossEntropy\nlayers:\n- activation: Identity\n  size: 784\n- activation: SparseField\n  size: 625\n- activation: SparseField\n  size: 625\n- activation: Softmax\n  size: 10\nepochs: 5\nlearning_rate: 0.01\nmomentum: 0.0\nbatch_size: 1\nweight_scale: 0.001\nweight_mean: 0.002\nweight_decay: 0\nevaluate_every: 5000\n"
  },
  {
    "path": "problem/sparse-max.yaml",
    "content": "# 15.83%\ndataset: Mnist\ncost: CrossEntropy\nlayers:\n- activation: Identity\n  size: 784\n- activation: SparseRange\n  size: 1000\n- activation: SparseRange\n  size: 1000\n- activation: Softmax\n  size: 10\nepochs: 10\nlearning_rate: 0.01\nmomentum: 0\nbatch_size: 1\nweight_scale: 0.001\nweight_mean: 0\nweight_decay: 0\nevaluate_every: 5000\n"
  },
  {
    "path": "problem/tying.yaml",
    "content": "dataset: Modulo\ncost: CrossEntropy\nlayers:\n- activation: Identity\n  size: 32\n- activation: Relu\n  size: 64\n- activation: Relu\n  size: 64\n- activation: Relu\n  size: 64\n- activation: Softmax\n  size: 7\nepochs: 1\nlearning_rate: 0.010\nweight_scale: 0.1\n# Tie together the two weight matrices\n# between the three Relu layers.\nweight_tying:\n- ['1,:,:', '2,:,:']\nevaluate_every: 5000\n"
  },
  {
    "path": "pylintrc",
    "content": "[MESSAGES CONTROL]\n\ndisable=\n    locally-disabled,\n    too-many-instance-attributes,\n    missing-docstring,\n    fixme,\n    too-few-public-methods,\n    invalid-name,\n    no-member,\n    redefined-outer-name\n\n[REPORTS]\n\nreports=no\n\n[BASIC]\n\ndocstring-min-length=2\n"
  },
  {
    "path": "setup.py",
    "content": "import os\nimport sys\nimport subprocess\nimport setuptools\nfrom setuptools.command.build_ext import build_ext\nfrom setuptools.command.test import test\n\n\nclass TestCommand(test):\n\n    description = 'run tests, linters and create a coverage report'\n    user_options = []\n\n    def __init__(self, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.returncode = 0\n\n    def finalize_options(self):\n        super().finalize_options()\n        # New setuptools don't need this anymore, thus the try block.\n        try:\n            # pylint: disable=attribute-defined-outside-init\n            self.test_args = []\n            self.test_suite = 'True'\n        except AttributeError:\n            pass\n\n    def run_tests(self):\n        self._call('python -m pytest --cov=layered test')\n        self._call('python -m pylint layered')\n        self._call('python -m pylint test')\n        self._call('python -m pylint setup.py')\n        self._check()\n\n    def _call(self, command):\n        env = os.environ.copy()\n        env['PYTHONPATH'] = ''.join(':' + x for x in sys.path)\n        print('Run command', command)\n        try:\n            subprocess.check_call(command.split(), env=env)\n        except subprocess.CalledProcessError as error:\n            print('Command failed with exit code', error.returncode)\n            self.returncode = 1\n\n    def _check(self):\n        if self.returncode:\n            sys.exit(self.returncode)\n\n\nclass BuildExtCommand(build_ext):\n    \"\"\"\n    Fix Numpy build error when bundled as a dependency.\n    From http://stackoverflow.com/a/21621689/1079110\n    \"\"\"\n\n    def finalize_options(self):\n        super().finalize_options()\n        __builtins__.__NUMPY_SETUP__ = False\n        import numpy\n        self.include_dirs.append(numpy.get_include())\n\n\nDESCRIPTION = 'Clean reference implementation of feed forward neural networks'\n\nSETUP_REQUIRES = [\n    'numpy',\n    
'sphinx',\n]\n\nINSTALL_REQUIRES = [\n    'PyYAML',\n    'numpy',\n    'matplotlib',\n]\n\nTESTS_REQUIRE = [\n    'pytest',\n    'pytest-cov',\n    'pylint',\n]\n\n\nif __name__ == '__main__':\n    setuptools.setup(\n        name='layered',\n        version='0.1.8',\n        description=DESCRIPTION,\n        url='http://github.com/danijar/layered',\n        author='Danijar Hafner',\n        author_email='mail@danijar.com',\n        license='MIT',\n        packages=['layered'],\n        setup_requires=SETUP_REQUIRES,\n        install_requires=INSTALL_REQUIRES,\n        tests_require=TESTS_REQUIRE,\n        cmdclass={'test': TestCommand, 'build_ext': BuildExtCommand},\n        entry_points={'console_scripts': ['layered=layered.__main__:main']},\n    )\n"
  },
  {
    "path": "test/__init__.py",
    "content": ""
  },
  {
    "path": "test/fixtures.py",
    "content": "import numpy as np\nimport pytest\nfrom layered.activation import Identity, Relu, Sigmoid, Softmax\nfrom layered.cost import SquaredError, CrossEntropy\nfrom layered.network import Matrices, Layer, Network\nfrom layered.utility import pairwise\nfrom layered.example import Example\n\n\ndef random_matrices(shapes):\n    np.random.seed(0)\n    matrix = Matrices(shapes)\n    matrix.flat = np.random.normal(0, 0.1, len(matrix.flat))\n    return matrix\n\n\n@pytest.fixture(params=[(5, 5, 6, 3)])\ndef weights(request):\n    shapes = list(pairwise(request.param))\n    weights = random_matrices(shapes)\n    return weights\n\n\n@pytest.fixture(params=[(5, 5, 6, 3)])\ndef weights_and_gradient(request):\n    shapes = list(pairwise(request.param))\n    weights = random_matrices(shapes)\n    gradient = random_matrices(shapes)\n    return weights, gradient\n\n\n@pytest.fixture(params=[Identity, Relu, Sigmoid, Softmax])\ndef network_and_weights(request):\n    np.random.seed(0)\n    layers = [Layer(5, Identity)] + [Layer(5, request.param) for _ in range(3)]\n    network = Network(layers)\n    weights = Matrices(network.shapes)\n    weights.flat = np.random.normal(0, 0.01, len(weights.flat))\n    return network, weights\n\n\n@pytest.fixture\ndef example():\n    data = np.array(range(5))\n    label = np.array(range(5))\n    return Example(data, label)\n\n\n@pytest.fixture\ndef examples():\n    examples = []\n    for i in range(7):\n        data = np.array(range(5)) + i\n        label = np.array(range(5)) + i\n        examples.append(Example(data, label))\n    return examples\n\n\n@pytest.fixture(params=[SquaredError, CrossEntropy])\ndef cost(request):\n    return request.param()\n"
  },
  {
    "path": "test/test_example.py",
    "content": "# pylint: disable=no-self-use\nimport numpy as np\nfrom layered.example import Example\n\n\nclass TestExample:\n\n    def test_representation(self):\n        data = np.array([1, 2, 3])\n        target = np.array([1, 2, 3])\n        example = Example(data, target)\n        repr(example)\n"
  },
  {
    "path": "test/test_gradient.py",
    "content": "# pylint: disable=no-self-use, wildcard-import, unused-wildcard-import\nfrom layered.activation import Identity, Relu\nfrom layered.cost import CrossEntropy\nfrom layered.gradient import (\n    NumericalGradient, Backprop, BatchBackprop, ParallelBackprop)\nfrom test.fixtures import *\n\n\nclass TestBackprop:\n\n    def test_against_numerical(self, network_and_weights, cost, example):\n        network, weights = network_and_weights\n        if isinstance(cost, CrossEntropy) and isinstance(\n                network.layers[1].activation, (Identity, Relu)):\n            pytest.xfail(\n                'Cross entropy doesn\\'t work with linear activations for some '\n                'reason.')\n        backprop = Backprop(network, cost)\n        numerical = NumericalGradient(network, cost)\n        gradient = backprop(weights, example)\n        reference = numerical(weights, example)\n        assert np.allclose(gradient, reference)\n\n\nclass TestBatchBackprop:\n\n    def test_calculation(self, network_and_weights, cost, examples):\n        network, weights = network_and_weights\n        batched = BatchBackprop(network, cost)\n        backprop = Backprop(network, cost)\n        gradient = batched(weights, examples)\n        reference = sum(backprop(weights, x) for x in examples) / len(examples)\n        assert np.allclose(gradient, reference)\n\n\nclass TestParallelBackprop:\n\n    def test_against_batch_backprop(self, network_and_weights, cost, examples):\n        network, weights = network_and_weights\n        parallel = ParallelBackprop(network, cost)\n        batched = BatchBackprop(network, cost)\n        gradient = parallel(weights, examples)\n        reference = batched(weights, examples)\n        assert np.allclose(gradient, reference)\n"
  },
  {
    "path": "test/test_network.py",
    "content": "# pylint: disable=no-self-use\nimport numpy as np\nimport pytest\nfrom layered.network import Matrices\n\n\n@pytest.fixture\ndef matrices():\n    return Matrices([(5, 8), (4, 2)])\n\n\nclass TestMatrices:\n\n    def test_initialization(self, matrices):\n        assert np.array_equal(matrices[0], np.zeros((5, 8)))\n        assert np.array_equal(matrices[1], np.zeros((4, 2)))\n\n    def test_indexing(self, matrices):\n        for index, matrix in enumerate(matrices):\n            for (x, y), _ in np.ndenumerate(matrix):\n                assert matrices[index][x, y] == matrices[index, x, y]\n\n    def test_slicing(self, matrices):\n        for index, matrix in enumerate(matrices):\n            assert (matrices[index][:, :] == matrices[index, :, :]).all()\n            assert (matrices[index][:, :] == matrix[:, :]).all()\n\n    def test_negative_indices(self, matrices):\n        for i in range(len(matrices)):\n            positive = matrices[len(matrices) - i - 1]\n            negative = matrices[i - 1]\n            assert negative.shape == positive.shape\n            assert (negative == positive).all()\n\n    def test_assignment(self, matrices):\n        matrices[0, 4, 5] = 42\n        assert matrices[0, 4, 5] == 42\n\n    def test_matrix_assignment(self, matrices):\n        np.random.seed(0)\n        matrix = np.random.rand(*matrices.shapes[0])\n        matrices[0] = matrix\n        assert (matrices[0] == matrix).all()\n\n    def test_sliced_matrix_assignment(self, matrices):\n        np.random.seed(0)\n        matrix = np.random.rand(*matrices.shapes[0])\n        matrices[0][:, :] = matrix\n        assert (matrices[0] == matrix).all()\n        matrices[0, :, :] = matrix\n        assert (matrices[0] == matrix).all()\n\n    def test_invalid_matrix_assignment(self, matrices):\n        np.random.seed(0)\n        shape = matrices.shapes[0]\n        matrix = np.random.rand(shape[0] + 1, shape[1])\n        with pytest.raises(ValueError):\n            
matrices[0] = matrix\n"
  },
  {
    "path": "test/test_optimization.py",
    "content": "# pylint: disable=wildcard-import, unused-wildcard-import, no-self-use\nimport numpy as np\nimport pytest\nfrom layered import optimization\nfrom test.fixtures import *\n\n\n@pytest.fixture(params=[(1, 1), (1, 2), (2, 1), (2, 2), (4, 5)])\ndef weights_and_gradient_and_groups(request):\n    size, layers = request.param\n    shapes = [(size, size)] * layers\n    weights = random_matrices(shapes)\n    gradient = random_matrices(shapes)\n    slices = [np.s_[i, :, :] for i, x in enumerate(weights)]\n    groups = (slices,)\n    return weights, gradient, groups\n\n\nclass TestGradientDecent:\n\n    def test_calculation(self, weights_and_gradient):\n        weights, gradient = weights_and_gradient\n        decent = optimization.GradientDecent()\n        updated = decent(weights, gradient, 0.1)\n        reference = weights - 0.1 * gradient\n        assert np.allclose(updated, reference)\n\n    def test_shapes_match(self, weights_and_gradient):\n        weights, gradient = weights_and_gradient\n        decent = optimization.GradientDecent()\n        updated = decent(weights, gradient, 0.1)\n        assert weights.shapes == updated.shapes\n\n    def test_copy_data(self, weights_and_gradient):\n        weights, gradient = weights_and_gradient\n        decent = optimization.GradientDecent()\n        before = weights.copy()\n        updated = decent(weights, gradient, 0.1)\n        assert (before.flat == weights.flat).all()\n        assert updated.flat[0] != 42\n        weights.flat[0] = 42\n        assert updated.flat[0] != 42\n\n\nclass TestMomentum:\n\n    def test_zero_rate(self, weights_and_gradient):\n        _, gradient = weights_and_gradient\n        original = gradient\n        momentum = optimization.Momentum()\n        for _ in range(5):\n            gradient = momentum(gradient, rate=0)\n        assert np.allclose(gradient, original)\n\n    def test_shapes_match(self, weights):\n        momentum = optimization.Momentum()\n        updated = 
momentum(weights, 0.9)\n        assert weights.shapes == updated.shapes\n\n    def test_copy_data(self, weights):\n        momentum = optimization.Momentum()\n        before = weights.copy()\n        updated = momentum(weights, 0.1)\n        assert (before.flat == weights.flat).all()\n        assert updated.flat[0] != 42\n        weights.flat[0] = 42\n        assert updated.flat[0] != 42\n\n\nclass TestWeightDecay:\n\n    def test_calculation(self, weights):\n        decay = optimization.WeightDecay()\n        updated = decay(weights, 0.1)\n        reference = 0.9 * weights\n        assert np.allclose(updated, reference)\n\n    def test_shapes_match(self, weights):\n        decay = optimization.WeightDecay()\n        updated = decay(weights, 0.1)\n        assert weights.shapes == updated.shapes\n\n    def test_copy_data(self, weights):\n        decay = optimization.WeightDecay()\n        before = weights.copy()\n        updated = decay(weights, 0.1)\n        assert (before.flat == weights.flat).all()\n        assert updated.flat[0] != 42\n        weights.flat[0] = 42\n        assert updated.flat[0] != 42\n\n\nclass TestWeightTying:\n\n    def test_calculation(self, weights_and_gradient_and_groups):\n        weights, _, groups = weights_and_gradient_and_groups\n        tying = optimization.WeightTying(*groups)\n        updated = tying(weights)\n        self._is_tied(updated, groups)\n\n    def test_shapes_match(self, weights_and_gradient_and_groups):\n        weights, _, groups = weights_and_gradient_and_groups\n        tying = optimization.WeightTying(*groups)\n        updated = tying(weights)\n        assert weights.shapes == updated.shapes\n\n    def test_dont_affect_others(self, weights_and_gradient_and_groups):\n        weights, _, _ = weights_and_gradient_and_groups\n        if len(weights.shapes) < 2:\n            pytest.skip()\n        group = (np.s_[0, :, :], np.s_[1, :, :])\n        tying = optimization.WeightTying(group)\n        updated = 
tying(weights)\n        assert (updated[0] == updated[1]).all()\n        for before, after in zip(weights[2:], updated[2:]):\n            assert (before == after).all()\n\n    def test_weights_stay_tied(self, weights_and_gradient_and_groups):\n        weights, gradient, groups = weights_and_gradient_and_groups\n        tying = optimization.WeightTying(*groups)\n        decent = optimization.GradientDecent()\n        weights = tying(weights)\n        weights = decent(weights, gradient, 0.1)\n        self._is_tied(weights, groups)\n\n    def test_copy_data(self, weights_and_gradient_and_groups):\n        weights, _, groups = weights_and_gradient_and_groups\n        tying = optimization.WeightTying(*groups)\n        before = weights.copy()\n        updated = tying(weights)\n        assert (before.flat == weights.flat).all()\n        assert updated.flat[0] != 42\n        weights.flat[0] = 42\n        assert updated.flat[0] != 42\n\n    def _is_tied(self, matrices, groups):\n        for group in groups:\n            slices = [matrices[x] for x in group]\n            assert [np.allclose(x, slices[0]) for x in slices]\n"
  },
  {
    "path": "test/test_plot.py",
    "content": "# pylint: disable=no-self-use\n\n\nclass TestPlot:\n\n    def test_interactive_backend(self):\n        import matplotlib\n        matplotlib.use('TkAgg')\n"
  },
  {
    "path": "test/test_problem.py",
    "content": "# pylint: disable=no-self-use\nimport pytest\nfrom layered.problem import Problem\n\n\nclass TestProblem:\n\n    def test_unknown_property(self):\n        with pytest.raises(Exception):\n            Problem('foo: 42')\n\n    def test_incompatible_type(self):\n        with pytest.raises(Exception):\n            Problem('learning_rate: foo')\n\n    def test_read_value(self):\n        problem = Problem('learning_rate: 0.4')\n        assert problem.learning_rate == 0.4\n\n    def test_default_value(self):\n        problem = Problem(' ')\n        print(problem)\n        assert problem.learning_rate == 0.1\n"
  },
  {
    "path": "test/test_trainer.py",
    "content": "# pylint: disable=no-self-use\nimport pytest\nfrom layered.trainer import Trainer\nfrom layered.problem import Problem\n\n\n@pytest.fixture\ndef problem():\n    return Problem(\n        \"\"\"\n        dataset: Test\n        layers:\n        - activation: Identity\n          size: 3\n        \"\"\")\n\n\nclass TestTrainer:\n\n    def test_no_crash(self, problem):\n        trainer = Trainer(problem)\n        trainer()\n"
  },
  {
    "path": "test/test_utility.py",
    "content": "# pylint: disable=no-self-use\nimport random\nfrom layered.utility import repeated, batched, averaged\n\n\nclass MockGenerator:\n\n    def __init__(self, data):\n        self.data = data\n        self.evaluated = 0\n\n    def __iter__(self):\n        for element in self.data:\n            self.evaluated += 1\n            yield element\n\n\nclass MockCustomOperators:\n\n    def __init__(self, value):\n        self.value = value\n\n    def __add__(self, other):\n        return MockCustomOperators(self.value + other.value)\n\n    __radd__ = __add__\n\n    def __truediv__(self, other):\n        return MockCustomOperators(self.value / other)\n\n\nclass TestRepeated:\n\n    def test_result(self):\n        iterable = range(14)\n        repeats = repeated(iterable, 3)\n        assert list(repeats) == list(iterable) * 3\n\n    def test_generator(self):\n        iterable = MockGenerator([1, 2, 3])\n        repeats = repeated(iterable, 3)\n        assert iterable.evaluated == 0\n        list(repeats)\n        assert iterable.evaluated == 3 * 3\n\n\nclass TestBatched:\n\n    def test_result(self):\n        # pylint: disable=redefined-variable-type\n        iterable = range(14)\n        batches = batched(iterable, 3)\n        batches = list(batches)\n        assert len(batches) == 5\n        assert len(batches[0]) == 3\n        assert len(batches[-1]) == 2\n\n    def test_generator(self):\n        iterable = MockGenerator([1, 2, 3])\n        batches = batched(iterable, 3)\n        assert iterable.evaluated == 0\n        list(batches)\n        assert iterable.evaluated == 3\n\n\nclass TestAveraged:\n\n    def test_result(self):\n        assert averaged(lambda x: x, [1, 2, 3, 4]) == 2.5\n        assert averaged(lambda x: x ** 2, [1, 2, 3, 4]) == 7.5\n\n    def test_custom_operators(self):\n        iterable = [MockCustomOperators(i) for i in range(1, 5)]\n        assert averaged(lambda x: x, iterable).value == 2.5\n\n    def test_supports_booleans(self):\n        
iterable = [True] * 5 + [False] * 5\n        random.shuffle(iterable)\n        assert averaged(lambda x: x, iterable) == 0.5\n"
  }
]