Full Code of parasdahal/deepnet for AI

master 51a9e61c3511 cached

13 files

44.1 KB

14.0k tokens

73 symbols

1 requests

Download .txt

Repository: parasdahal/deepnet
Branch: master
Commit: 51a9e61c3511
Files: 13
Total size: 44.1 KB

Directory structure:
gitextract_e4h7xye1/

├── .gitignore
├── LICENSE
├── README.md
├── deepnet/
│   ├── Gradient Checking.ipynb
│   ├── im2col.py
│   ├── layers.py
│   ├── loss.py
│   ├── nnet.py
│   ├── solver.py
│   └── utils.py
├── requirements.txt
├── run_cnn.py
└── run_rnn.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# IPython Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# dotenv
.env

# virtualenv
venv/
ENV/

# Spyder project settings
.spyderproject

# Rope project settings
.ropeproject

*.sublime*
MNIST_data/

================================================
FILE: LICENSE
================================================
The MIT License (MIT)

Copyright (c) 2016 Paras Dahal

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

================================================
FILE: README.md
================================================
# deepnet

Implementations of CNNs, RNNs and cool new techniques in deep learning

Note: deepnet is a work in progress and things will be added gradually. It is not intended for production, use it to learn and study implementations of latest and greatest in deep learning.

## What does it have?

**Network Architecture**
1. Convolutional net
2. Feed forward net
3. Recurrent net (LSTM/GRU coming soon)

**Optimization Algorithms**
1. SGD
2. SGD with momentum
3. Nesterov Accelerated Gradient
4. Adagrad
5. RMSprop
6. Adam

**Regularization**
1. Dropout
2. L1 and L2 Regularization

**Cool Techniques**

1. BatchNorm
2. Xavier Weight Initialization

**Nonlinearities**
1. ReLU
2. Sigmoid
3. tanh


## Usage

1. ```virtualenv .env``` ; create a virtual environment
2. ```source .env/bin/activate``` ; activate the virtual environment
3. ```pip install -r requirements.txt``` ; Install dependencies
4. ```python run_cnn.py {mnist|cifar10}``` ; mnist for shallow cnn and cifar10 for deep cnn

================================================
FILE: deepnet/Gradient Checking.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Numerical Gradient checking of Layers\n",
    "\n",
    "Verify the correctness of implementation using Gradient checks provided in CS231 2nd assignment.\n",
    "\n",
    "1. **Probably Wrong**: relative error > 1e-2 \n",
    "2. **Something not right** :1e-2 > relative error > 1e-4 \n",
    "3. **Okay for objectives with kinks**: 1e-4 > relative error, if no kinks then too high\n",
    "4. **Most likely Right**: relative error < 1e-7 "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from layers import *\n",
    "from loss import SoftmaxLoss\n",
    "from nnet import NeuralNet\n",
    "from solver import sgd,sgd_momentum,adam\n",
    "import sys"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Numerical Gradient Functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def rel_error(x, y):\n",
    "  \"\"\" returns relative error \"\"\"\n",
    "  return np.max(np.abs(x - y) / (np.maximum(1e-8, np.abs(x) + np.abs(y))))\n",
    "\n",
    "def numerical_gradient_array(f, x, df, h=1e-5):\n",
    "  \"\"\"\n",
    "  Evaluate a numeric gradient for a function that accepts a numpy\n",
    "  array and returns a numpy array.\n",
    "  \"\"\"\n",
    "  grad = np.zeros_like(x)\n",
    "  it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])\n",
    "  while not it.finished:\n",
    "\n",
    "    ix = it.multi_index\n",
    "    oldval = x[ix]\n",
    "    x[ix] = oldval + h\n",
    "    pos = f(x).copy()\n",
    "    x[ix] = oldval - h\n",
    "    neg = f(x).copy()\n",
    "    x[ix] = oldval\n",
    "\n",
    "    grad[ix] = np.sum((pos - neg) * df) / (2 * h)\n",
    "\n",
    "    it.iternext()\n",
    "  return grad\n",
    "\n",
    "def eval_numerical_gradient(f, x, verbose=True, h=0.00001):\n",
    "  \"\"\"\n",
    "  a naive implementation of numerical gradient of f at x\n",
    "  - f should be a function that takes a single argument\n",
    "  - x is the point (numpy array) to evaluate the gradient at\n",
    "  \"\"\"\n",
    "\n",
    "  fx = f(x) # evaluate function value at original point\n",
    "\n",
    "  grad = np.zeros_like(x)\n",
    "  # iterate over all indexes in x\n",
    "  it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])\n",
    "  while not it.finished:\n",
    "    # evaluate function at x+h\n",
    "    ix = it.multi_index\n",
    "    oldval = x[ix]\n",
    "    x[ix] = oldval + h # increment by h\n",
    "    fxph = f(x) # evalute f(x + h)\n",
    "    x[ix] = oldval - h\n",
    "    fxmh = f(x) # evaluate f(x - h)\n",
    "    x[ix] = oldval # restore\n",
    "\n",
    "    # compute the partial derivative with centered formula\n",
    "    grad[ix] = (fxph - fxmh) / (2 * h) # the slope\n",
    "    if verbose:\n",
    "      print(ix, grad[ix])\n",
    "    it.iternext() # step to next dimension\n",
    "\n",
    "  return grad"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Convolution Layer\n",
    "\n",
    "Perform numerical grdient checking for verifying the implementation of convolution layer."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Forward Pass\n",
    "\n",
    "The difference of correct_out and out should be around 1e-8"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Testing forward pass of Conv Layer\n",
      "Difference:  2.21214764967e-08\n"
     ]
    }
   ],
   "source": [
    "x_shape = (2, 3, 4, 4)\n",
    "w_shape = (3, 3, 4, 4)\n",
    "x = np.linspace(-0.1, 0.5, num=np.prod(x_shape)).reshape(x_shape)\n",
    "w = np.linspace(-0.2, 0.3, num=np.prod(w_shape)).reshape(w_shape)\n",
    "b = np.linspace(-0.1, 0.2, num=3)\n",
    "\n",
    "c_layer = Conv((3,4,4),n_filter=3,h_filter=4,w_filter=4,stride=2,padding=1)\n",
    "c_layer.W = w\n",
    "c_layer.b = b.reshape(-1,1)\n",
    "\n",
    "correct_out = np.array([[[[-0.08759809, -0.10987781],\n",
    "                           [-0.18387192, -0.2109216 ]],\n",
    "                          [[ 0.21027089,  0.21661097],\n",
    "                           [ 0.22847626,  0.23004637]],\n",
    "                          [[ 0.50813986,  0.54309974],\n",
    "                           [ 0.64082444,  0.67101435]]],\n",
    "                         [[[-0.98053589, -1.03143541],\n",
    "                           [-1.19128892, -1.24695841]],\n",
    "                          [[ 0.69108355,  0.66880383],\n",
    "                           [ 0.59480972,  0.56776003]],\n",
    "                          [[ 2.36270298,  2.36904306],\n",
    "                           [ 2.38090835,  2.38247847]]]])\n",
    "\n",
    "out = c_layer.forward(x)\n",
    "\n",
    "error = rel_error(out,correct_out)\n",
    "print(\"Testing forward pass of Conv Layer\")\n",
    "print(\"Difference: \",error)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Backward pass\n",
    "\n",
    "The errors for gradients should be around 1e-9"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Testing backward pass of Conv Layer\n",
      "dX error:  6.30285589596e-09\n",
      "dW error:  3.66468373932e-10\n",
      "db error:  6.8390384471e-12\n"
     ]
    }
   ],
   "source": [
    "x = np.random.randn(4, 3, 5, 5)\n",
    "w = np.random.randn(2, 3, 3, 3)\n",
    "b = np.random.randn(2,).reshape(-1,1)\n",
    "dout = np.random.randn(4, 2, 5, 5)\n",
    "\n",
    "c_layer = Conv((3,5,5),n_filter=2,h_filter=3,w_filter=3,stride=1,padding=1)\n",
    "c_layer.W = w\n",
    "c_layer.b = b\n",
    "\n",
    "dx_num = numerical_gradient_array(lambda x: c_layer.forward(x), x, dout)\n",
    "dw_num = numerical_gradient_array(lambda w: c_layer.forward(x), w, dout)\n",
    "db_num = numerical_gradient_array(lambda b: c_layer.forward(x), b, dout)\n",
    "\n",
    "out = c_layer.forward(x)\n",
    "dx,grads = c_layer.backward(dout)\n",
    "dw,db = grads\n",
    "\n",
    "print(\"Testing backward pass of Conv Layer\")\n",
    "print(\"dX error: \",rel_error(dx,dx_num))\n",
    "print(\"dW error: \",rel_error(dw,dw_num))\n",
    "print(\"db error: \",rel_error(db,db_num))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Maxpool Layer\n",
    "\n",
    "Perform gradient check for maxpool layer and verify correctness of its implementation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Forward Pass\n",
    "\n",
    "Difference should be around 1e-8"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Testing max_pool_forward_naive function:\n",
      "difference:  4.16666651573e-08\n"
     ]
    }
   ],
   "source": [
    "x_shape = (2, 3, 4, 4)\n",
    "x = np.linspace(-0.3, 0.4, num=np.prod(x_shape)).reshape(x_shape)\n",
    "\n",
    "pool = Maxpool((3,4,4),size=2,stride=2)\n",
    "\n",
    "out = pool.forward(x,)\n",
    "correct_out = np.array([[[[-0.26315789, -0.24842105],\n",
    "                          [-0.20421053, -0.18947368]],\n",
    "                         [[-0.14526316, -0.13052632],\n",
    "                          [-0.08631579, -0.07157895]],\n",
    "                         [[-0.02736842, -0.01263158],\n",
    "                          [ 0.03157895,  0.04631579]]],\n",
    "                        [[[ 0.09052632,  0.10526316],\n",
    "                          [ 0.14947368,  0.16421053]],\n",
    "                         [[ 0.20842105,  0.22315789],\n",
    "                          [ 0.26736842,  0.28210526]],\n",
    "                         [[ 0.32631579,  0.34105263],\n",
    "                          [ 0.38526316,  0.4       ]]]])\n",
    "\n",
    "print('Testing max_pool_forward_naive function:')\n",
    "print('difference: ', rel_error(out, correct_out))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Backward Pass\n",
    "\n",
    "Error should be around 1e-12"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Testing bacward pass of Maxpool layer\n",
      "dX error:  3.27561819731e-12\n"
     ]
    }
   ],
   "source": [
    "x = np.random.randn(3, 2, 8, 8)\n",
    "dout = np.random.randn(3, 2, 4, 4)\n",
    "\n",
    "pool = Maxpool((2,8,8),size=2,stride=2)\n",
    "\n",
    "dx_num = numerical_gradient_array(lambda x: pool.forward(x), x, dout)\n",
    "\n",
    "out = pool.forward(x)\n",
    "dx,_ = pool.backward(dout)\n",
    "\n",
    "print('Testing bacward pass of Maxpool layer')\n",
    "print('dX error: ', rel_error(dx, dx_num))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## ReLU Layer\n",
    "Error should be around 1e-12"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Testing backward pass of ReLU layer\n",
      "dX error:  3.275621976e-12\n"
     ]
    }
   ],
   "source": [
    "x = np.random.randn(3, 2, 8, 8)\n",
    "dout = np.random.randn(3, 2, 8, 8)\n",
    "\n",
    "r = ReLU()\n",
    "\n",
    "dx_num = numerical_gradient_array(lambda x:r.forward(x), x, dout)\n",
    "\n",
    "out = r.forward(x)\n",
    "dx,_ = r.backward(dout)\n",
    "\n",
    "print('Testing backward pass of ReLU layer')\n",
    "print('dX error: ',rel_error(dx,dx_num))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conv-ReLU-MaxPool"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Testing conv_relu_pool\n",
      "dx error:  1.01339343448e-08\n",
      "dw error:  7.41563088659e-10\n",
      "db error:  7.51304173633e-11\n"
     ]
    }
   ],
   "source": [
    "x = np.random.randn(2, 3, 16, 16)\n",
    "w = np.random.randn(3, 3, 3, 3)\n",
    "b = np.random.randn(3,).reshape(-1,1)\n",
    "dout = np.random.randn(2, 3, 8, 8)\n",
    "\n",
    "c = Conv((3,16,16),n_filter=3,h_filter=3,w_filter=3,stride=1,padding=1)\n",
    "c.W, c.b = w, b\n",
    "r = ReLU()\n",
    "m = Maxpool(c.out_dim,size=2,stride=2)\n",
    "\n",
    "def conv_relu_pool_forward(c,r,m,x):\n",
    "    c_out = c.forward(x)\n",
    "    r_out = r.forward(c_out)\n",
    "    m_out = m.forward(r_out)\n",
    "    return m_out\n",
    "\n",
    "dx_num = numerical_gradient_array(lambda x: conv_relu_pool_forward(c,r,m,x), x, dout)\n",
    "dw_num = numerical_gradient_array(lambda w: conv_relu_pool_forward(c,r,m,x), w, dout)\n",
    "db_num = numerical_gradient_array(lambda b: conv_relu_pool_forward(c,r,m,x), b, dout)\n",
    "\n",
    "m_dx,_ = m.backward(dout)\n",
    "r_dx,_ = r.backward(m_dx)\n",
    "dx,grads = c.backward(r_dx)\n",
    "dw,db = grads\n",
    "\n",
    "\n",
    "print('Testing conv_relu_pool')\n",
    "print('dx error: ', rel_error(dx_num, dx))\n",
    "print('dw error: ', rel_error(dw_num, dw))\n",
    "print('db error: ', rel_error(db_num, db))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Fully Connected Layer"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 1.49834967  1.70660132  1.91485297]\n",
      " [ 3.25553199  3.5141327   3.77273342]]\n",
      "Testing fully connected forward pass:\n",
      "difference:  9.76985004799e-10\n"
     ]
    }
   ],
   "source": [
    "num_inputs = 2\n",
    "input_shape = (4, 5, 6)\n",
    "output_dim = 3\n",
    "\n",
    "input_size = num_inputs * np.prod(input_shape)\n",
    "weight_size = output_dim * np.prod(input_shape)\n",
    "\n",
    "x = np.linspace(-0.1, 0.5, num=input_size).reshape(num_inputs, *input_shape)\n",
    "w = np.linspace(-0.2, 0.3, num=weight_size).reshape(np.prod(input_shape), output_dim)\n",
    "b = np.linspace(-0.3, 0.1, num=output_dim).reshape(1,-1)\n",
    "\n",
    "flat = Flatten()\n",
    "x = flat.forward(x)\n",
    "\n",
    "f = FullyConnected(120,3)\n",
    "f.W,f.b= w,b\n",
    "out = f.forward(x)\n",
    "\n",
    "correct_out = np.array([[ 1.49834967,  1.70660132,  1.91485297],\n",
    "                        [ 3.25553199,  3.5141327,   3.77273342]])\n",
    "\n",
    "print(out)\n",
    "# Compare your output with ours. The error should be around 1e-9.\n",
    "print('Testing fully connected forward pass:')\n",
    "print('difference: ', rel_error(out, correct_out))\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Testing fully connected backward pass:\n",
      "dx error:  2.89903091526e-09\n",
      "dw error:  1.32127575542e-09\n",
      "db error:  1.03150657456e-11\n"
     ]
    }
   ],
   "source": [
    "x = np.random.randn(10, 2, 3)\n",
    "w = np.random.randn(6, 5)\n",
    "b = np.random.randn(5)\n",
    "dout = np.random.randn(10, 5)\n",
    "\n",
    "flat = Flatten()\n",
    "x = flat.forward(x)\n",
    "\n",
    "f = FullyConnected(60,5)\n",
    "f.W,f.b= w,b\n",
    "\n",
    "dx_num = numerical_gradient_array(lambda x: f.forward(x), x, dout)\n",
    "dw_num = numerical_gradient_array(lambda w: f.forward(x), w, dout)\n",
    "db_num = numerical_gradient_array(lambda b: f.forward(x), b, dout)\n",
    "\n",
    "dx,grads= f.backward(dout)\n",
    "dw, db = grads\n",
    "# The error should be around 1e-10\n",
    "print('Testing fully connected backward pass:')\n",
    "print('dx error: ', rel_error(dx_num, dx))\n",
    "print('dw error: ', rel_error(dw_num, dw))\n",
    "print('db error: ', rel_error(db_num, db))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Softmax Loss\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Testing SoftmaxLoss:\n",
      "loss:  2.30283790984\n",
      "dx error:  1.05396983612e-08\n"
     ]
    }
   ],
   "source": [
    "num_classes, num_inputs = 10, 50\n",
    "x = 0.001 * np.random.randn(num_inputs, num_classes)\n",
    "y = np.random.randint(num_classes, size=num_inputs)\n",
    "\n",
    "dx_num = eval_numerical_gradient(lambda x: SoftmaxLoss(x,y)[0], x,verbose=False)\n",
    "loss,dx = SoftmaxLoss(x,y)\n",
    "\n",
    "# Test softmax_loss function. Loss should be 2.3 and dx error should be 1e-8\n",
    "print('Testing SoftmaxLoss:')\n",
    "print('loss: ', loss)\n",
    "print('dx error: ', rel_error(dx_num, dx))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: deepnet/im2col.py
================================================
import numpy as np


def get_im2col_indices(x_shape, field_height=3, field_width=3, padding=1, stride=1):
  # First figure out what the size of the output should be
  N, C, H, W = x_shape
  assert (H + 2 * padding - field_height) % stride == 0
  assert (W + 2 * padding - field_height) % stride == 0
  out_height = (H + 2 * padding - field_height) / stride + 1
  out_width = (W + 2 * padding - field_width) / stride + 1

  i0 = np.repeat(np.arange(field_height,dtype='int32'), field_width)
  i0 = np.tile(i0, C)
  i1 = stride * np.repeat(np.arange(out_height,dtype='int32'), out_width)
  j0 = np.tile(np.arange(field_width), field_height * C)
  j1 = stride * np.tile(np.arange(out_width,dtype='int32'), int(out_height))
  i = i0.reshape(-1, 1) + i1.reshape(1, -1)
  j = j0.reshape(-1, 1) + j1.reshape(1, -1)

  k = np.repeat(np.arange(C,dtype='int32'), field_height * field_width).reshape(-1, 1)

  return (k, i, j)

def im2col_indices(x, field_height=3, field_width=3, padding=1, stride=1):
  """ An implementation of im2col based on some fancy indexing """
  # Zero-pad the input
  p = padding
  x_padded = np.pad(x, ((0, 0), (0, 0), (p, p), (p, p)), mode='constant')

  k, i, j = get_im2col_indices(x.shape, field_height, field_width, padding,
                               stride)

  cols = x_padded[:, k, i, j]
  C = x.shape[1]
  cols = cols.transpose(1, 2, 0).reshape(field_height * field_width * C, -1)
  return cols


def col2im_indices(cols, x_shape, field_height=3, field_width=3, padding=1,
                   stride=1):
  """ An implementation of col2im based on fancy indexing and np.add.at """
  N, C, H, W = x_shape
  H_padded, W_padded = H + 2 * padding, W + 2 * padding
  x_padded = np.zeros((N, C, H_padded, W_padded), dtype=cols.dtype)
  k, i, j = get_im2col_indices(x_shape, field_height, field_width, padding,
                               stride)
  cols_reshaped = cols.reshape(C * field_height * field_width, -1, N)
  cols_reshaped = cols_reshaped.transpose(2, 0, 1)
  np.add.at(x_padded, (slice(None), k, i, j), cols_reshaped)
  if padding == 0:
    return x_padded
  return x_padded[:, :, padding:-padding, padding:-padding]

pass

================================================
FILE: deepnet/layers.py
================================================
import numpy as np
from deepnet.im2col import *


class Conv():

    def __init__(self, X_dim, n_filter, h_filter, w_filter, stride, padding):

        self.d_X, self.h_X, self.w_X = X_dim

        self.n_filter, self.h_filter, self.w_filter = n_filter, h_filter, w_filter
        self.stride, self.padding = stride, padding

        self.W = np.random.randn(
            n_filter, self.d_X, h_filter, w_filter) / np.sqrt(n_filter / 2.)
        self.b = np.zeros((self.n_filter, 1))
        self.params = [self.W, self.b]

        self.h_out = (self.h_X - h_filter + 2 * padding) / stride + 1
        self.w_out = (self.w_X - w_filter + 2 * padding) / stride + 1

        if not self.h_out.is_integer() or not self.w_out.is_integer():
            raise Exception("Invalid dimensions!")

        self.h_out, self.w_out = int(self.h_out), int(self.w_out)
        self.out_dim = (self.n_filter, self.h_out, self.w_out)

    def forward(self, X):

        self.n_X = X.shape[0]

        self.X_col = im2col_indices(
            X, self.h_filter, self.w_filter, stride=self.stride, padding=self.padding)
        W_row = self.W.reshape(self.n_filter, -1)

        out = W_row @ self.X_col + self.b
        out = out.reshape(self.n_filter, self.h_out, self.w_out, self.n_X)
        out = out.transpose(3, 0, 1, 2)
        return out

    def backward(self, dout):

        dout_flat = dout.transpose(1, 2, 3, 0).reshape(self.n_filter, -1)

        dW = dout_flat @ self.X_col.T
        dW = dW.reshape(self.W.shape)

        db = np.sum(dout, axis=(0, 2, 3)).reshape(self.n_filter, -1)

        W_flat = self.W.reshape(self.n_filter, -1)

        dX_col = W_flat.T @ dout_flat
        shape = (self.n_X, self.d_X, self.h_X, self.w_X)
        dX = col2im_indices(dX_col, shape, self.h_filter,
                            self.w_filter, self.padding, self.stride)

        return dX, [dW, db]


class Maxpool():

    def __init__(self, X_dim, size, stride):

        self.d_X, self.h_X, self.w_X = X_dim

        self.params = []

        self.size = size
        self.stride = stride

        self.h_out = (self.h_X - size) / stride + 1
        self.w_out = (self.w_X - size) / stride + 1

        if not self.h_out.is_integer() or not self.w_out.is_integer():
            raise Exception("Invalid dimensions!")

        self.h_out, self.w_out = int(self.h_out), int(self.w_out)
        self.out_dim = (self.d_X, self.h_out, self.w_out)

    def forward(self, X):
        self.n_X = X.shape[0]
        X_reshaped = X.reshape(
            X.shape[0] * X.shape[1], 1, X.shape[2], X.shape[3])

        self.X_col = im2col_indices(
            X_reshaped, self.size, self.size, padding=0, stride=self.stride)

        self.max_indexes = np.argmax(self.X_col, axis=0)
        out = self.X_col[self.max_indexes, range(self.max_indexes.size)]

        out = out.reshape(self.h_out, self.w_out, self.n_X,
                          self.d_X).transpose(2, 3, 0, 1)
        return out

    def backward(self, dout):

        dX_col = np.zeros_like(self.X_col)
        # flatten the gradient
        dout_flat = dout.transpose(2, 3, 0, 1).ravel()

        dX_col[self.max_indexes, range(self.max_indexes.size)] = dout_flat

        # get the original X_reshaped structure from col2im
        shape = (self.n_X * self.d_X, 1, self.h_X, self.w_X)
        dX = col2im_indices(dX_col, shape, self.size,
                            self.size, padding=0, stride=self.stride)
        dX = dX.reshape(self.n_X, self.d_X, self.h_X, self.w_X)
        return dX, []


class Flatten():

    def __init__(self):
        self.params = []

    def forward(self, X):
        self.X_shape = X.shape
        self.out_shape = (self.X_shape[0], -1)
        out = X.ravel().reshape(self.out_shape)
        self.out_shape = self.out_shape[1]
        return out

    def backward(self, dout):
        out = dout.reshape(self.X_shape)
        return out, ()


class FullyConnected():

    def __init__(self, in_size, out_size):

        self.W = np.random.randn(in_size, out_size) / np.sqrt(in_size / 2.)
        self.b = np.zeros((1, out_size))
        self.params = [self.W, self.b]

    def forward(self, X):
        self.X = X
        out = self.X @ self.W + self.b
        return out

    def backward(self, dout):
        dW = self.X.T @ dout
        db = np.sum(dout, axis=0)
        dX = dout @ self.W.T
        return dX, [dW, db]


class Batchnorm():

    def __init__(self, X_dim):
        self.d_X, self.h_X, self.w_X = X_dim
        self.gamma = np.ones((1, int(np.prod(X_dim))))
        self.beta = np.zeros((1, int(np.prod(X_dim))))
        self.params = [self.gamma, self.beta]

    def forward(self, X):
        self.n_X = X.shape[0]
        self.X_shape = X.shape

        self.X_flat = X.ravel().reshape(self.n_X, -1)
        self.mu = np.mean(self.X_flat, axis=0)
        self.var = np.var(self.X_flat, axis=0)
        self.X_norm = (self.X_flat - self.mu) / np.sqrt(self.var + 1e-8)
        out = self.gamma * self.X_norm + self.beta

        return out.reshape(self.X_shape)

    def backward(self, dout):

        dout = dout.ravel().reshape(dout.shape[0], -1)
        X_mu = self.X_flat - self.mu
        var_inv = 1. / np.sqrt(self.var + 1e-8)

        dbeta = np.sum(dout, axis=0)
        dgamma = np.sum(dout * self.X_norm, axis=0)

        dX_norm = dout * self.gamma
        dvar = np.sum(dX_norm * X_mu, axis=0) * - \
            0.5 * (self.var + 1e-8)**(-3 / 2)
        dmu = np.sum(dX_norm * -var_inv, axis=0) + dvar * \
            1 / self.n_X * np.sum(-2. * X_mu, axis=0)
        dX = (dX_norm * var_inv) + (dmu / self.n_X) + \
            (dvar * 2 / self.n_X * X_mu)

        dX = dX.reshape(self.X_shape)
        return dX, [dgamma, dbeta]


class Dropout():

    def __init__(self, prob=0.5):
        self.prob = prob
        self.params = []

    def forward(self, X):
        self.mask = np.random.binomial(1, self.prob, size=X.shape) / self.prob
        out = X * self.mask
        return out.reshape(X.shape)

    def backward(self, dout):
        dX = dout * self.mask
        return dX, []


class ReLU():
    def __init__(self):
        self.params = []

    def forward(self, X):
        self.X = X
        return np.maximum(0, X)

    def backward(self, dout):
        dX = dout.copy()
        dX[self.X <= 0] = 0
        return dX, []


class sigmoid():
    def __init__(self):
        self.params = []

    def forward(self, X):
        out = 1.0 / (1.0 + np.exp(X))
        self.out = out
        return out

    def backward(self, dout):
        dX = dout * self.out * (1 - self.out)
        return dX, []


class tanh():
    def __init__(self):
        self.params = []

    def forward(self, X):
        out = np.tanh(X)
        self.out = out
        return out

    def backward(self, dout):
        dX = dout * (1 - self.out**2)
        return dX, []


================================================
FILE: deepnet/loss.py
================================================
import numpy as np
from deepnet.utils import softmax
from deepnet.layers import Conv, FullyConnected


def l2_regularization(layers, lam=0.001):
    reg_loss = 0.0
    for layer in layers:
        if hasattr(layer, 'W'):
            reg_loss += 0.5 * lam * np.sum(layer.W * layer.W)
    return reg_loss


def delta_l2_regularization(layers, grads, lam=0.001):
    for layer, grad in zip(layers, reversed(grads)):
        if hasattr(layer, 'W'):
            grad[0] += lam * layer.W
    return grads


def l1_regularization(layers, lam=0.001):
    reg_loss = 0.0
    for layer in layers:
        if hasattr(layer, 'W'):
            reg_loss += lam * np.sum(np.abs(layer.W))
    return reg_loss


def delta_l1_regularization(layers, grads, lam=0.001):
    for layer, grad in zip(layers, reversed(grads)):
        if hasattr(layer, 'W'):
            grad[0] += lam * layer.W / (np.abs(layer.W) + 1e-8)
    return grads


def SoftmaxLoss(X, y):
    m = y.shape[0]
    p = softmax(X)
    log_likelihood = -np.log(p[range(m), y])
    loss = np.sum(log_likelihood) / m

    dx = p.copy()
    dx[range(m), y] -= 1
    dx /= m
    return loss, dx


================================================
FILE: deepnet/nnet.py
================================================
import numpy as np
from deepnet.loss import SoftmaxLoss, l2_regularization, delta_l2_regularization
from deepnet.utils import accuracy, softmax
from deepnet.utils import one_hot_encode

class CNN:

    def __init__(self, layers, loss_func=SoftmaxLoss):
        self.layers = layers
        self.params = []
        for layer in self.layers:
            self.params.append(layer.params)
        self.loss_func = loss_func

    def forward(self, X):
        for layer in self.layers:
            X = layer.forward(X)
        return X

    def backward(self, dout):
        grads = []
        for layer in reversed(self.layers):
            dout, grad = layer.backward(dout)
            grads.append(grad)
        return grads

    def train_step(self, X, y):
        out = self.forward(X)
        loss, dout = self.loss_func(out, y)
        loss += l2_regularization(self.layers)
        grads = self.backward(dout)
        grads = delta_l2_regularization(self.layers, grads)
        return loss, grads

    def predict(self, X):
        X = self.forward(X)
        return np.argmax(softmax(X), axis=1)


class RNN:

    def __init__(self, vocab_size, h_size, char_to_idx, idx_to_char):
        self.vocab_size = vocab_size
        self.h_size = h_size
        self.char_to_idx = char_to_idx
        self.idx_to_char = idx_to_char
        self.model = dict(
            Wxh=np.random.rand(vocab_size, h_size) / np.sqrt(vocab_size / 2),
            Whh=np.random.rand(h_size, h_size) / np.sqrt(h_size / 2),
            Why=np.random.rand(h_size, vocab_size) / np.sqrt(h_size / 2),
            bh=np.zeros((1, vocab_size)),
            by=np.zeros((1, h_size))
        )
        self.initial_state = np.zeros((1, self.h_size))

    def _forward(self, X, h):
        # input to one hot
        X_onehot = np.zeros(self.vocab_size)
        X_onehot[X] = 1
        X_onehot = X_onehot.reshape(1,-1)

        h_prev = h.copy()
        # calculate hidden step with tanh
        h = np.tanh(np.dot(X,self.model['Wxh']) + np.dot(h_prev,self.model['Whh']) + self.model['bh'])

        # fully connected forward step
        y = np.dot(X, self.model['Why']) + self.model['by']

        cache = (X_onehot, h_prev)
        return y, h, cache

    def _backward(self, out, y, dh_next, cache):

        X_onehot, h_prev = cache

        # gradient of output from froward step
        dout = softmax(out)
        dout[range(len(y)), y] -= 1
        # fully connected backward step
        dWhy = X_onehot.T @ dout
        dby = np.sum(dWhy, axis=0).reshape(1, -1)
        dh = dout @ self.dWhy.T
        # gradient through tanh
        dh = dout * (1 - out**2)
        # add up gradient from previous gradient
        dh += dh_next
        # hidden state
        dbh = dh
        dWhh = h_prev.T @ dh
        dWxh = X_onehot.T @ dh
        dh_next = dh @ Whh.T

        grads = dict(Wxh=dWxh, Whh=dWhh, Why=dWhy, bh=dbh, by=dby)

        return grads, dh_next

    def train_step(self,X_train, y_train, h):
        ys, caches = [], []
        total_loss = 0
        grads = {k: np.zeros_like(v) for k, v in self.model.items()}

        # forward pass and store values for bptt
        for x, y in zip(X_train, y_train):
            y_pred, h, cache = self._forward(x, h)
            p = softmax(y_pred)
            log_likelihood = -np.log(p[range(y_pred.shape[0]), y])
            total_loss += np.sum(log_likelihood) / y_pred.shape[0]
            ys.append(y_pred)
            caches.append(cache)

        total_loss /= X_train.shape[0]

        # backprop through time
        dh_next = np.zeros((1, self.h_size))
        for t in reversed(range(len(X_train))):
            grad, dh_next = self._backward(
                ys[t], y_train[t], dh_next, caches[t])
            # sum up the gradients for each time step
            for k in grads.keys():
                grads[k] += grad[k]

        # clip vanishing/exploding gradients
        for k, v in grads.items():
            grads[k] = np.clip(v, -5.0, 5.0)

        return loss, grads, h

    def predict(self, X):
        X = self.forward(X)
        return np.argmax(softmax(X), axis=1)


================================================
FILE: deepnet/solver.py
================================================
import numpy as np
from sklearn.utils import shuffle
from deepnet.utils import accuracy
import copy
from deepnet.loss import SoftmaxLoss


def get_minibatches(X, y, minibatch_size,shuffleTag=True):
    m = X.shape[0]
    minibatches = []
    if shuffleTag:
        X, y = shuffle(X, y)
    for i in range(0, m, minibatch_size):
        X_batch = X[i:i + minibatch_size, :, :, :]
        y_batch = y[i:i + minibatch_size, ]
        minibatches.append((X_batch, y_batch))
    return minibatches


def vanilla_update(params, grads, learning_rate=0.01):
    for param, grad in zip(params, reversed(grads)):
        for i in range(len(grad)):
            param[i] += - learning_rate * grad[i]


def momentum_update(velocity, params, grads, learning_rate=0.01, mu=0.9):
    for v, param, grad, in zip(velocity, params, reversed(grads)):
        for i in range(len(grad)):
            v[i] = mu * v[i] + learning_rate * grad[i]
            param[i] -= v[i]


def adagrad_update(cache, params, grads, learning_rate=0.01):
    for c, param, grad, in zip(cache, params, reversed(grads)):
        for i in range(len(grad)):
            cache[i] += grad[i]**2
            param[i] += - learning_rate * grad[i] / (np.sqrt(cache[i]) + 1e-8)


def rmsprop_update(cache, params, grads, learning_rate=0.01, decay_rate=0.9):
    for c, param, grad, in zip(cache, params, reversed(grads)):
        for i in range(len(grad)):
            cache[i] = decay_rate * cache[i] + (1 - decay_rate) * grad[i]**2
            param[i] += - learning_rate * grad[i] / (np.sqrt(cache[i]) + 1e-4)


def sgd(nnet, X_train, y_train, minibatch_size, epoch, learning_rate, verbose=True,
        X_test=None, y_test=None):
    minibatches = get_minibatches(X_train, y_train, minibatch_size)
    for i in range(epoch):
        loss = 0
        if verbose:
            print("Epoch {0}".format(i + 1))
        for X_mini, y_mini in minibatches:
            loss, grads = nnet.train_step(X_mini, y_mini)
            vanilla_update(nnet.params, grads, learning_rate=learning_rate)
        if verbose:
            train_acc = accuracy(y_train, nnet.predict(X_train))
            test_acc = accuracy(y_test, nnet.predict(X_test))
            print("Loss = {0} | Training Accuracy = {1} | Test Accuracy = {2}".format(
                loss, train_acc, test_acc))
    return nnet

def sgd_rnn(nnet, X_train, y_train, minibatch_size, epoch, learning_rate, verbose=True):
    for i in range(epoch):
        loss = 0
        if verbose:
            print("Epoch {0}".format(i + 1))
        hidden_state = nnet.initial_state
        loss, grads, hidden_state = nnet.train_step(X_train, y_train, hidden_state)

        for k in grads.keys():
            nnet.model[k] -= learning_rate * grads[k]
        
        if verbose:
            print("Loss = {0}".format(loss))
    return nnet


def sgd_momentum(nnet, X_train, y_train, minibatch_size, epoch, learning_rate, mu=0.9,
                 verbose=True, X_test=None, y_test=None, nesterov=True):

    minibatches = get_minibatches(X_train, y_train, minibatch_size)

    for i in range(epoch):
        loss = 0
        velocity = []
        for param_layer in nnet.params:
            p = [np.zeros_like(param) for param in list(param_layer)]
            velocity.append(p)

        if verbose:
            print("Epoch {0}".format(i + 1))

        for X_mini, y_mini in minibatches:

            if nesterov:
                for param, ve in zip(nnet.params, velocity):
                    for i in range(len(param)):
                        param[i] += mu * ve[i]

            loss, grads = nnet.train_step(X_mini, y_mini)
            momentum_update(velocity, nnet.params, grads,
                            learning_rate=learning_rate, mu=mu)

        if verbose:
            m_train = X_train.shape[0]
            m_test = X_test.shape[0]
            y_train_pred = np.array([], dtype="int64")
            y_test_pred = np.array([], dtype="int64")
            for i in range(0, m_train, minibatch_size):
                X_tr = X_train[i:i + minibatch_size, :, :, :]
                y_tr = y_train[i:i + minibatch_size, ]
                y_train_pred = np.append(y_train_pred, nnet.predict(X_tr))
            for i in range(0, m_test, minibatch_size):
                X_te = X_test[i:i + minibatch_size, :, :, :]
                y_te = y_test[i:i + minibatch_size, ]
                y_test_pred = np.append(y_test_pred, nnet.predict(X_te))

            train_acc = accuracy(y_train, y_train_pred)
            test_acc = accuracy(y_test, y_test_pred)
            print("Loss = {0} | Training Accuracy = {1} | Test Accuracy = {2}".format(
                loss, train_acc, test_acc))
    return nnet


def adam(nnet, X_train, y_train, minibatch_size, epoch, learning_rate, verbose=True,
         X_test=None, y_test=None):
    beta1 = 0.9
    beta2 = 0.999
    minibatches = get_minibatches(X_train, y_train, minibatch_size)
    for i in range(epoch):
        loss = 0
        velocity, cache = [], []
        for param_layer in nnet.params:
            p = [np.zeros_like(param) for param in list(param_layer)]
            velocity.append(p)
            cache.append(p)
        if verbose:
            print("Epoch {0}".format(i + 1))
        t = 1
        for X_mini, y_mini in minibatches:
            loss, grads = nnet.train_step(X_mini, y_mini)
            for c, v, param, grad, in zip(cache, velocity, nnet.params, reversed(grads)):
                for i in range(len(grad)):
                    c[i] = beta1 * c[i] + (1. - beta1) * grad[i]
                    v[i] = beta2 * v[i] + (1. - beta2) * (grad[i]**2)
                    mt = c[i] / (1. - beta1**(t))
                    vt = v[i] / (1. - beta2**(t))
                    param[i] += - learning_rate * mt / (np.sqrt(vt) + 1e-4)
            t += 1

        if verbose:
            train_acc = accuracy(y_train, nnet.predict(X_train))
            test_acc = accuracy(y_test, nnet.predict(X_test))
            print("Loss = {0} | Training Accuracy = {1} | Test Accuracy = {2}".format(
                loss, train_acc, test_acc))
    return nnet


================================================
FILE: deepnet/utils.py
================================================
import numpy as np
import _pickle as cPickle
import gzip
import os


def one_hot_encode(y, num_class):
    m = y.shape[0]
    onehot = np.zeros((m, num_class), dtype="int32")
    for i in range(m):
        idx = y[i]
        onehot[i][idx] = 1
    return onehot


def accuracy(y_true, y_pred):
    return np.mean(y_pred == y_true)  # both are not one hot encoded


def softmax(x):
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)


def load_mnist(path, num_training=50000, num_test=10000, cnn=True, one_hot=False):
    f = gzip.open(path, 'rb')
    training_data, validation_data, test_data = cPickle.load(
        f, encoding='iso-8859-1')
    f.close()
    X_train, y_train = training_data
    X_validation, y_validation = validation_data
    X_test, y_test = test_data
    if cnn:
        shape = (-1, 1, 28, 28)
        X_train = X_train.reshape(shape)
        X_validation = X_validation.reshape(shape)
        X_test = X_test.reshape(shape)
    if one_hot:
        y_train = one_hot_encode(y_train, 10)
        y_validation = one_hot_encode(y_validation, 10)
        y_test = one_hot_encode(y_test, 10)
    X_train, y_train = X_train[range(
        num_training)], y_train[range(num_training)]
    X_test, y_test = X_test[range(num_test)], y_test[range(num_test)]
    return (X_train, y_train), (X_test, y_test)


def load_cifar10(path, num_training=1000, num_test=1000):
    Xs, ys = [], []
    for batch in range(1, 6):
        f = open(os.path.join(path, "data_batch_{0}".format(batch)), 'rb')
        data = cPickle.load(f, encoding='iso-8859-1')
        f.close()
        X = data["data"].reshape(10000, 3, 32, 32).astype("float64")
        y = np.array(data["labels"])
        Xs.append(X)
        ys.append(y)
    f = open(os.path.join(CIFAR10_PATH, "test_batch"), 'rb')
    data = cPickle.load(f, encoding='iso-8859-1')
    f.close()
    X_train, y_train = np.concatenate(Xs), np.concatenate(ys)
    X_test = data["data"].reshape(10000, 3, 32, 32).astype("float")
    y_test = np.array(data["labels"])
    X_train, y_train = X_train[range(
        num_training)], y_train[range(num_training)]
    X_test, y_test = X_test[range(num_test)], y_test[range(num_test)]
    mean = np.mean(X_train, axis=0)
    std = np.std(X_train)
    X_train /= 255.0
    X_test /= 255.0
    return (X_train, y_train), (X_test, y_test)


================================================
FILE: requirements.txt
================================================
numpy==1.11.3
scipy==0.16.1
matplotlib==1.5.0
ipykernel==4.2.2
ipython==4.0.1
ipython-genutils==0.1.0
ipywidgets==4.1.1

================================================
FILE: run_cnn.py
================================================
import numpy as np
from deepnet.utils import load_mnist, load_cifar10
from deepnet.layers import *
from deepnet.solver import sgd, sgd_momentum, adam
from deepnet.nnet import CNN
import sys


def make_mnist_cnn(X_dim, num_class):
    conv = Conv(X_dim, n_filter=32, h_filter=3,
                w_filter=3, stride=1, padding=1)
    relu_conv = ReLU()
    maxpool = Maxpool(conv.out_dim, size=2, stride=1)
    flat = Flatten()
    fc = FullyConnected(np.prod(maxpool.out_dim), num_class)
    return [conv, relu_conv, maxpool, flat, fc]


def make_cifar10_cnn(X_dim, num_class):
    conv = Conv(X_dim, n_filter=16, h_filter=5,
                w_filter=5, stride=1, padding=2)
    relu = ReLU()
    maxpool = Maxpool(conv.out_dim, size=2, stride=2)
    conv2 = Conv(maxpool.out_dim, n_filter=20, h_filter=5,
                 w_filter=5, stride=1, padding=2)
    relu2 = ReLU()
    maxpool2 = Maxpool(conv2.out_dim, size=2, stride=2)
    flat = Flatten()
    fc = FullyConnected(np.prod(maxpool2.out_dim), num_class)
    return [conv, relu, maxpool, conv2, relu2, maxpool2, flat, fc]


if __name__ == "__main__":

    if sys.argv[1] == "mnist":

        training_set, test_set = load_mnist(
            'data/mnist.pkl.gz', num_training=1000, num_test=1000)
        X, y = training_set
        X_test, y_test = test_set
        mnist_dims = (1, 28, 28)
        cnn = CNN(make_mnist_cnn(mnist_dims, num_class=10))
        cnn = sgd_momentum(cnn, X, y, minibatch_size=35, epoch=20,
                           learning_rate=0.01, X_test=X_test, y_test=y_test)

    if sys.argv[1] == "cifar10":
        training_set, test_set = load_cifar10(
            'data/cifar-10', num_training=1000, num_test=100)
        X, y = training_set
        X_test, y_test = test_set
        cifar10_dims = (3, 32, 32)
        cnn = CNN(make_cifar10_cnn(cifar10_dims, num_class=10))
        cnn = sgd_momentum(cnn, X, y, minibatch_size=10, epoch=200,
                           learning_rate=0.01, X_test=X_test, y_test=y_test)


================================================
FILE: run_rnn.py
================================================
import numpy as np
from deepnet.nnet import RNN
from deepnet.solver import sgd_rnn


def text_to_inputs(path):
    """
    Converts the given text into X and y vectors
    X : contains the index of all the characters in the text vocab
    y : y[i] contains the index of next character for X[i] in the text vocab
    """
    with open(path) as f:
        txt = f.read()
        X, y = [], []

        char_to_idx = {char: i for i, char in enumerate(set(txt))}
        idx_to_char = {i: char for i, char in enumerate(set(txt))}
        X = np.array([char_to_idx[i] for i in txt])
        y = [char_to_idx[i] for i in txt[1:]]
        y.append(char_to_idx['.'])
        y = np.array(y)

        vocab_size = len(char_to_idx)
        return X, y, vocab_size, char_to_idx, idx_to_char


if __name__ == "__main__":

    X, y, vocab_size, char_to_idx, idx_to_char = text_to_inputs('data/Rnn.txt')
    rnn = RNN(vocab_size,vocab_size,char_to_idx,idx_to_char)
    rnn = sgd_rnn(rnn,X,y,10,10,0.1)

Download .txt

gitextract_e4h7xye1/

├── .gitignore
├── LICENSE
├── README.md
├── deepnet/
│   ├── Gradient Checking.ipynb
│   ├── im2col.py
│   ├── layers.py
│   ├── loss.py
│   ├── nnet.py
│   ├── solver.py
│   └── utils.py
├── requirements.txt
├── run_cnn.py
└── run_rnn.py

Download .txt

SYMBOL INDEX (73 symbols across 8 files)

FILE: deepnet/im2col.py
  function get_im2col_indices (line 4) | def get_im2col_indices(x_shape, field_height=3, field_width=3, padding=1...
  function im2col_indices (line 24) | def im2col_indices(x, field_height=3, field_width=3, padding=1, stride=1):
  function col2im_indices (line 39) | def col2im_indices(cols, x_shape, field_height=3, field_width=3, padding=1,

FILE: deepnet/layers.py
  class Conv (line 5) | class Conv():
    method __init__ (line 7) | def __init__(self, X_dim, n_filter, h_filter, w_filter, stride, padding):
    method forward (line 28) | def forward(self, X):
    method backward (line 41) | def backward(self, dout):
  class Maxpool (line 60) | class Maxpool():
    method __init__ (line 62) | def __init__(self, X_dim, size, stride):
    method forward (line 80) | def forward(self, X):
    method backward (line 95) | def backward(self, dout):
  class Flatten (line 111) | class Flatten():
    method __init__ (line 113) | def __init__(self):
    method forward (line 116) | def forward(self, X):
    method backward (line 123) | def backward(self, dout):
  class FullyConnected (line 128) | class FullyConnected():
    method __init__ (line 130) | def __init__(self, in_size, out_size):
    method forward (line 136) | def forward(self, X):
    method backward (line 141) | def backward(self, dout):
  class Batchnorm (line 148) | class Batchnorm():
    method __init__ (line 150) | def __init__(self, X_dim):
    method forward (line 156) | def forward(self, X):
    method backward (line 168) | def backward(self, dout):
  class Dropout (line 189) | class Dropout():
    method __init__ (line 191) | def __init__(self, prob=0.5):
    method forward (line 195) | def forward(self, X):
    method backward (line 200) | def backward(self, dout):
  class ReLU (line 205) | class ReLU():
    method __init__ (line 206) | def __init__(self):
    method forward (line 209) | def forward(self, X):
    method backward (line 213) | def backward(self, dout):
  class sigmoid (line 219) | class sigmoid():
    method __init__ (line 220) | def __init__(self):
    method forward (line 223) | def forward(self, X):
    method backward (line 228) | def backward(self, dout):
  class tanh (line 233) | class tanh():
    method __init__ (line 234) | def __init__(self):
    method forward (line 237) | def forward(self, X):
    method backward (line 242) | def backward(self, dout):

FILE: deepnet/loss.py
  function l2_regularization (line 6) | def l2_regularization(layers, lam=0.001):
  function delta_l2_regularization (line 14) | def delta_l2_regularization(layers, grads, lam=0.001):
  function l1_regularization (line 21) | def l1_regularization(layers, lam=0.001):
  function delta_l1_regularization (line 29) | def delta_l1_regularization(layers, grads, lam=0.001):
  function SoftmaxLoss (line 36) | def SoftmaxLoss(X, y):

FILE: deepnet/nnet.py
  class CNN (line 6) | class CNN:
    method __init__ (line 8) | def __init__(self, layers, loss_func=SoftmaxLoss):
    method forward (line 15) | def forward(self, X):
    method backward (line 20) | def backward(self, dout):
    method train_step (line 27) | def train_step(self, X, y):
    method predict (line 35) | def predict(self, X):
  class RNN (line 40) | class RNN:
    method __init__ (line 42) | def __init__(self, vocab_size, h_size, char_to_idx, idx_to_char):
    method _forward (line 56) | def _forward(self, X, h):
    method _backward (line 72) | def _backward(self, out, y, dh_next, cache):
    method train_step (line 97) | def train_step(self,X_train, y_train, h):
    method predict (line 128) | def predict(self, X):

FILE: deepnet/solver.py
  function get_minibatches (line 8) | def get_minibatches(X, y, minibatch_size,shuffleTag=True):
  function vanilla_update (line 20) | def vanilla_update(params, grads, learning_rate=0.01):
  function momentum_update (line 26) | def momentum_update(velocity, params, grads, learning_rate=0.01, mu=0.9):
  function adagrad_update (line 33) | def adagrad_update(cache, params, grads, learning_rate=0.01):
  function rmsprop_update (line 40) | def rmsprop_update(cache, params, grads, learning_rate=0.01, decay_rate=...
  function sgd (line 47) | def sgd(nnet, X_train, y_train, minibatch_size, epoch, learning_rate, ve...
  function sgd_rnn (line 64) | def sgd_rnn(nnet, X_train, y_train, minibatch_size, epoch, learning_rate...
  function sgd_momentum (line 80) | def sgd_momentum(nnet, X_train, y_train, minibatch_size, epoch, learning...
  function adam (line 127) | def adam(nnet, X_train, y_train, minibatch_size, epoch, learning_rate, v...

FILE: deepnet/utils.py
  function one_hot_encode (line 7) | def one_hot_encode(y, num_class):
  function accuracy (line 16) | def accuracy(y_true, y_pred):
  function softmax (line 20) | def softmax(x):
  function load_mnist (line 25) | def load_mnist(path, num_training=50000, num_test=10000, cnn=True, one_h...
  function load_cifar10 (line 48) | def load_cifar10(path, num_training=1000, num_test=1000):

FILE: run_cnn.py
  function make_mnist_cnn (line 9) | def make_mnist_cnn(X_dim, num_class):
  function make_cifar10_cnn (line 19) | def make_cifar10_cnn(X_dim, num_class):

FILE: run_rnn.py
  function text_to_inputs (line 6) | def text_to_inputs(path):

Download .json

Condensed preview — 13 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (49K chars).

[
  {
    "path": ".gitignore",
    "chars": 1068,
    "preview": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packagi"
  },
  {
    "path": "LICENSE",
    "chars": 1077,
    "preview": "The MIT License (MIT)\n\nCopyright (c) 2016 Paras Dahal\n\nPermission is hereby granted, free of charge, to any person obtai"
  },
  {
    "path": "README.md",
    "chars": 988,
    "preview": "# deepnet\n\nImplementations of CNNs, RNNs and cool new techniques in deep learning\n\nNote: deepnet is a work in progress a"
  },
  {
    "path": "deepnet/Gradient Checking.ipynb",
    "chars": 16114,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Numerical Gradient checking of La"
  },
  {
    "path": "deepnet/im2col.py",
    "chars": 2157,
    "preview": "import numpy as np\n\n\ndef get_im2col_indices(x_shape, field_height=3, field_width=3, padding=1, stride=1):\n  # First figu"
  },
  {
    "path": "deepnet/layers.py",
    "chars": 6865,
    "preview": "import numpy as np\nfrom deepnet.im2col import *\n\n\nclass Conv():\n\n    def __init__(self, X_dim, n_filter, h_filter, w_fil"
  },
  {
    "path": "deepnet/loss.py",
    "chars": 1138,
    "preview": "import numpy as np\nfrom deepnet.utils import softmax\nfrom deepnet.layers import Conv, FullyConnected\n\n\ndef l2_regulariza"
  },
  {
    "path": "deepnet/nnet.py",
    "chars": 4129,
    "preview": "import numpy as np\nfrom deepnet.loss import SoftmaxLoss, l2_regularization, delta_l2_regularization\nfrom deepnet.utils i"
  },
  {
    "path": "deepnet/solver.py",
    "chars": 6107,
    "preview": "import numpy as np\nfrom sklearn.utils import shuffle\nfrom deepnet.utils import accuracy\nimport copy\nfrom deepnet.loss im"
  },
  {
    "path": "deepnet/utils.py",
    "chars": 2392,
    "preview": "import numpy as np\nimport _pickle as cPickle\nimport gzip\nimport os\n\n\ndef one_hot_encode(y, num_class):\n    m = y.shape[0"
  },
  {
    "path": "requirements.txt",
    "chars": 119,
    "preview": "numpy==1.11.3\nscipy==0.16.1\nmatplotlib==1.5.0\nipykernel==4.2.2\nipython==4.0.1\nipython-genutils==0.1.0\nipywidgets==4.1.1"
  },
  {
    "path": "run_cnn.py",
    "chars": 2001,
    "preview": "import numpy as np\nfrom deepnet.utils import load_mnist, load_cifar10\nfrom deepnet.layers import *\nfrom deepnet.solver i"
  },
  {
    "path": "run_rnn.py",
    "chars": 996,
    "preview": "import numpy as np\nfrom deepnet.nnet import RNN\nfrom deepnet.solver import sgd_rnn\n\n\ndef text_to_inputs(path):\n    \"\"\"\n "
  }
]

About this extraction

This page contains the full source code of the parasdahal/deepnet GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 13 files (44.1 KB), approximately 14.0k tokens, and a symbol index with 73 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.

Extract another repo