[
  {
    "path": ".gitignore",
    "content": "*~\n\\.DS_STORE\nbuild/\ndist/\n*egg-info*\n*__pycache__/\n*.py[cod]\n*eggs*\n*\\.png\n"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2017 Erik Linder-Norén\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE."
  },
  {
    "path": "MANIFEST.in",
    "content": "recursive-include mlfs.data *"
  },
  {
    "path": "README.md",
    "content": "# Machine Learning From Scratch\n\n## About\nPython implementations of some of the fundamental Machine Learning models and algorithms from scratch.\n\nThe purpose of this project is not to produce as optimized and computationally efficient algorithms as possible\nbut rather to present the inner workings of them in a transparent and accessible way.\n\n## Table of Contents\n- [Machine Learning From Scratch](#machine-learning-from-scratch)\n  * [About](#about)\n  * [Table of Contents](#table-of-contents)\n  * [Installation](#installation)\n  * [Examples](#examples)\n    + [Polynomial Regression](#polynomial-regression)\n    + [Classification With CNN](#classification-with-cnn)\n    + [Density-Based Clustering](#density-based-clustering)\n    + [Generating Handwritten Digits](#generating-handwritten-digits)\n    + [Deep Reinforcement Learning](#deep-reinforcement-learning)\n    + [Image Reconstruction With RBM](#image-reconstruction-with-rbm)\n    + [Evolutionary Evolved Neural Network](#evolutionary-evolved-neural-network)\n    + [Genetic Algorithm](#genetic-algorithm)\n    + [Association Analysis](#association-analysis)\n  * [Implementations](#implementations)\n    + [Supervised Learning](#supervised-learning)\n    + [Unsupervised Learning](#unsupervised-learning)\n    + [Reinforcement Learning](#reinforcement-learning)\n    + [Deep Learning](#deep-learning)\n  * [Contact](#contact)\n\n## Installation\n    $ git clone https://github.com/eriklindernoren/ML-From-Scratch\n    $ cd ML-From-Scratch\n    $ python setup.py install\n\n## Examples\n### Polynomial Regression\n    $ python mlfromscratch/examples/polynomial_regression.py\n\n<p align=\"center\">\n    <img src=\"http://eriklindernoren.se/images/p_reg.gif\" width=\"640\"\\>\n</p>\n<p align=\"center\">\n    Figure: Training progress of a regularized polynomial regression model fitting <br>\n    temperature data measured in Linköping, Sweden 2016.\n</p>\n\n### Classification With CNN\n    $ python 
mlfromscratch/examples/convolutional_neural_network.py\n\n    +---------+\n    | ConvNet |\n    +---------+\n    Input Shape: (1, 8, 8)\n    +----------------------+------------+--------------+\n    | Layer Type           | Parameters | Output Shape |\n    +----------------------+------------+--------------+\n    | Conv2D               | 160        | (16, 8, 8)   |\n    | Activation (ReLU)    | 0          | (16, 8, 8)   |\n    | Dropout              | 0          | (16, 8, 8)   |\n    | BatchNormalization   | 2048       | (16, 8, 8)   |\n    | Conv2D               | 4640       | (32, 8, 8)   |\n    | Activation (ReLU)    | 0          | (32, 8, 8)   |\n    | Dropout              | 0          | (32, 8, 8)   |\n    | BatchNormalization   | 4096       | (32, 8, 8)   |\n    | Flatten              | 0          | (2048,)      |\n    | Dense                | 524544     | (256,)       |\n    | Activation (ReLU)    | 0          | (256,)       |\n    | Dropout              | 0          | (256,)       |\n    | BatchNormalization   | 512        | (256,)       |\n    | Dense                | 2570       | (10,)        |\n    | Activation (Softmax) | 0          | (10,)        |\n    +----------------------+------------+--------------+\n    Total Parameters: 538570\n\n    Training: 100% [------------------------------------------------------------------------] Time: 0:01:55\n    Accuracy: 0.987465181058\n\n<p align=\"center\">\n    <img src=\"http://eriklindernoren.se/images/mlfs_cnn1.png\" width=\"640\">\n</p>\n<p align=\"center\">\n    Figure: Classification of the digit dataset using CNN.\n</p>\n\n### Density-Based Clustering\n    $ python mlfromscratch/examples/dbscan.py\n\n<p align=\"center\">\n    <img src=\"http://eriklindernoren.se/images/mlfs_dbscan.png\" width=\"640\">\n</p>\n<p align=\"center\">\n    Figure: Clustering of the moons dataset using DBSCAN.\n</p>\n\n### Generating Handwritten Digits\n    $ python 
mlfromscratch/unsupervised_learning/generative_adversarial_network.py\n\n    +-----------+\n    | Generator |\n    +-----------+\n    Input Shape: (100,)\n    +------------------------+------------+--------------+\n    | Layer Type             | Parameters | Output Shape |\n    +------------------------+------------+--------------+\n    | Dense                  | 25856      | (256,)       |\n    | Activation (LeakyReLU) | 0          | (256,)       |\n    | BatchNormalization     | 512        | (256,)       |\n    | Dense                  | 131584     | (512,)       |\n    | Activation (LeakyReLU) | 0          | (512,)       |\n    | BatchNormalization     | 1024       | (512,)       |\n    | Dense                  | 525312     | (1024,)      |\n    | Activation (LeakyReLU) | 0          | (1024,)      |\n    | BatchNormalization     | 2048       | (1024,)      |\n    | Dense                  | 803600     | (784,)       |\n    | Activation (TanH)      | 0          | (784,)       |\n    +------------------------+------------+--------------+\n    Total Parameters: 1489936\n\n    +---------------+\n    | Discriminator |\n    +---------------+\n    Input Shape: (784,)\n    +------------------------+------------+--------------+\n    | Layer Type             | Parameters | Output Shape |\n    +------------------------+------------+--------------+\n    | Dense                  | 401920     | (512,)       |\n    | Activation (LeakyReLU) | 0          | (512,)       |\n    | Dropout                | 0          | (512,)       |\n    | Dense                  | 131328     | (256,)       |\n    | Activation (LeakyReLU) | 0          | (256,)       |\n    | Dropout                | 0          | (256,)       |\n    | Dense                  | 514        | (2,)         |\n    | Activation (Softmax)   | 0          | (2,)         |\n    +------------------------+------------+--------------+\n    Total Parameters: 533762\n\n\n<p align=\"center\">\n    <img 
src=\"http://eriklindernoren.se/images/gan_mnist5.gif\" width=\"640\">\n</p>\n<p align=\"center\">\n    Figure: Training progress of a Generative Adversarial Network generating <br>\n    handwritten digits.\n</p>\n\n### Deep Reinforcement Learning\n    $ python mlfromscratch/examples/deep_q_network.py\n\n    +----------------+\n    | Deep Q-Network |\n    +----------------+\n    Input Shape: (4,)\n    +-------------------+------------+--------------+\n    | Layer Type        | Parameters | Output Shape |\n    +-------------------+------------+--------------+\n    | Dense             | 320        | (64,)        |\n    | Activation (ReLU) | 0          | (64,)        |\n    | Dense             | 130        | (2,)         |\n    +-------------------+------------+--------------+\n    Total Parameters: 450\n\n<p align=\"center\">\n    <img src=\"http://eriklindernoren.se/images/mlfs_dql1.gif\" width=\"640\">\n</p>\n<p align=\"center\">\n    Figure: Deep Q-Network solution to the CartPole-v1 environment in OpenAI gym.\n</p>\n\n### Image Reconstruction With RBM\n    $ python mlfromscratch/examples/restricted_boltzmann_machine.py\n\n<p align=\"center\">\n    <img src=\"http://eriklindernoren.se/images/rbm_digits1.gif\" width=\"640\">\n</p>\n<p align=\"center\">\n    Figure: Shows how the network gets better during training at reconstructing <br>\n    the digit 2 in the MNIST dataset.\n</p>\n\n### Evolutionary Evolved Neural Network\n    $ python mlfromscratch/examples/neuroevolution.py\n\n    +---------------+\n    | Model Summary |\n    +---------------+\n    Input Shape: (64,)\n    +----------------------+------------+--------------+\n    | Layer Type           | Parameters | Output Shape |\n    +----------------------+------------+--------------+\n    | Dense                | 1040       | (16,)        |\n    | Activation (ReLU)    | 0          | (16,)        |\n    | Dense                | 170        | (10,)        |\n    | Activation (Softmax) | 0          | (10,)       
 |\n    +----------------------+------------+--------------+\n    Total Parameters: 1210\n\n    Population Size: 100\n    Generations: 3000\n    Mutation Rate: 0.01\n\n    [0 Best Individual - Fitness: 3.08301, Accuracy: 10.5%]\n    [1 Best Individual - Fitness: 3.08746, Accuracy: 12.0%]\n    ...\n    [2999 Best Individual - Fitness: 94.08513, Accuracy: 98.5%]\n    Test set accuracy: 96.7%\n\n<p align=\"center\">\n    <img src=\"http://eriklindernoren.se/images/evo_nn4.png\" width=\"640\">\n</p>\n<p align=\"center\">\n    Figure: Classification of the digit dataset by a neural network which has<br>\n    been evolutionary evolved.\n</p>\n\n### Genetic Algorithm\n    $ python mlfromscratch/examples/genetic_algorithm.py\n\n    +--------+\n    |   GA   |\n    +--------+\n    Description: Implementation of a Genetic Algorithm which aims to produce\n    the user specified target string. This implementation calculates each\n    candidate's fitness based on the alphabetical distance between the candidate\n    and the target. A candidate is selected as a parent with probabilities proportional\n    to the candidate's fitness. Reproduction is implemented as a single-point\n    crossover between pairs of parents. 
Mutation is done by randomly assigning\n    new characters with uniform probability.\n\n    Parameters\n    ----------\n    Target String: 'Genetic Algorithm'\n    Population Size: 100\n    Mutation Rate: 0.05\n\n    [0 Closest Candidate: 'CJqlJguPlqzvpoJmb', Fitness: 0.00]\n    [1 Closest Candidate: 'MCxZxdr nlfiwwGEk', Fitness: 0.01]\n    [2 Closest Candidate: 'MCxZxdm nlfiwwGcx', Fitness: 0.01]\n    [3 Closest Candidate: 'SmdsAklMHn kBIwKn', Fitness: 0.01]\n    [4 Closest Candidate: '  lotneaJOasWfu Z', Fitness: 0.01]\n    ...\n    [292 Closest Candidate: 'GeneticaAlgorithm', Fitness: 1.00]\n    [293 Closest Candidate: 'GeneticaAlgorithm', Fitness: 1.00]\n    [294 Answer: 'Genetic Algorithm']\n\n### Association Analysis\n    $ python mlfromscratch/examples/apriori.py\n    +-------------+\n    |   Apriori   |\n    +-------------+\n    Minimum Support: 0.25\n    Minimum Confidence: 0.8\n    Transactions:\n        [1, 2, 3, 4]\n        [1, 2, 4]\n        [1, 2]\n        [2, 3, 4]\n        [2, 3]\n        [3, 4]\n        [2, 4]\n    Frequent Itemsets:\n        [1, 2, 3, 4, [1, 2], [1, 4], [2, 3], [2, 4], [3, 4], [1, 2, 4], [2, 3, 4]]\n    Rules:\n        1 -> 2 (support: 0.43, confidence: 1.0)\n        4 -> 2 (support: 0.57, confidence: 0.8)\n        [1, 4] -> 2 (support: 0.29, confidence: 1.0)\n\n\n## Implementations\n### Supervised Learning\n- [Adaboost](mlfromscratch/supervised_learning/adaboost.py)\n- [Bayesian Regression](mlfromscratch/supervised_learning/bayesian_regression.py)\n- [Decision Tree](mlfromscratch/supervised_learning/decision_tree.py)\n- [Elastic Net](mlfromscratch/supervised_learning/regression.py)\n- [Gradient Boosting](mlfromscratch/supervised_learning/gradient_boosting.py)\n- [K Nearest Neighbors](mlfromscratch/supervised_learning/k_nearest_neighbors.py)\n- [Lasso Regression](mlfromscratch/supervised_learning/regression.py)\n- [Linear Discriminant Analysis](mlfromscratch/supervised_learning/linear_discriminant_analysis.py)\n- [Linear 
Regression](mlfromscratch/supervised_learning/regression.py)\n- [Logistic Regression](mlfromscratch/supervised_learning/logistic_regression.py)\n- [Multi-class Linear Discriminant Analysis](mlfromscratch/supervised_learning/multi_class_lda.py)\n- [Multilayer Perceptron](mlfromscratch/supervised_learning/multilayer_perceptron.py)\n- [Naive Bayes](mlfromscratch/supervised_learning/naive_bayes.py)\n- [Neuroevolution](mlfromscratch/supervised_learning/neuroevolution.py)\n- [Particle Swarm Optimization of Neural Network](mlfromscratch/supervised_learning/particle_swarm_optimization.py)\n- [Perceptron](mlfromscratch/supervised_learning/perceptron.py)\n- [Polynomial Regression](mlfromscratch/supervised_learning/regression.py)\n- [Random Forest](mlfromscratch/supervised_learning/random_forest.py)\n- [Ridge Regression](mlfromscratch/supervised_learning/regression.py)\n- [Support Vector Machine](mlfromscratch/supervised_learning/support_vector_machine.py)\n- [XGBoost](mlfromscratch/supervised_learning/xgboost.py)\n\n### Unsupervised Learning\n- [Apriori](mlfromscratch/unsupervised_learning/apriori.py)\n- [Autoencoder](mlfromscratch/unsupervised_learning/autoencoder.py)\n- [DBSCAN](mlfromscratch/unsupervised_learning/dbscan.py)\n- [FP-Growth](mlfromscratch/unsupervised_learning/fp_growth.py)\n- [Gaussian Mixture Model](mlfromscratch/unsupervised_learning/gaussian_mixture_model.py)\n- [Generative Adversarial Network](mlfromscratch/unsupervised_learning/generative_adversarial_network.py)\n- [Genetic Algorithm](mlfromscratch/unsupervised_learning/genetic_algorithm.py)\n- [K-Means](mlfromscratch/unsupervised_learning/k_means.py)\n- [Partitioning Around Medoids](mlfromscratch/unsupervised_learning/partitioning_around_medoids.py)\n- [Principal Component Analysis](mlfromscratch/unsupervised_learning/principal_component_analysis.py)\n- [Restricted Boltzmann Machine](mlfromscratch/unsupervised_learning/restricted_boltzmann_machine.py)\n\n### Reinforcement Learning\n- [Deep 
Q-Network](mlfromscratch/reinforcement_learning/deep_q_network.py)\n\n### Deep Learning\n  + [Neural Network](mlfromscratch/deep_learning/neural_network.py)\n  + [Layers](mlfromscratch/deep_learning/layers.py)\n    * Activation Layer\n    * Average Pooling Layer\n    * Batch Normalization Layer\n    * Constant Padding Layer\n    * Convolutional Layer\n    * Dropout Layer\n    * Flatten Layer\n    * Fully-Connected (Dense) Layer\n    * Fully-Connected RNN Layer\n    * Max Pooling Layer\n    * Reshape Layer\n    * Up Sampling Layer\n    * Zero Padding Layer\n  + Model Types\n    * [Convolutional Neural Network](mlfromscratch/examples/convolutional_neural_network.py)\n    * [Multilayer Perceptron](mlfromscratch/examples/multilayer_perceptron.py)\n    * [Recurrent Neural Network](mlfromscratch/examples/recurrent_neural_network.py)\n\n## Contact\nIf there's some implementation you would like to see here or if you're just feeling social,\nfeel free to [email](mailto:eriklindernoren@gmail.com) me or connect with me on [LinkedIn](https://www.linkedin.com/in/eriklindernoren/).\n"
  },
  {
    "path": "mlfromscratch/__init__.py",
    "content": ""
  },
  {
    "path": "mlfromscratch/data/TempLinkoping2016.txt",
    "content": "time\ttemp\n0.00273224\t0.1\n0.005464481\t-4.5\n0.008196721\t-6.3\n0.010928962\t-9.6\n0.013661202\t-9.9\n0.016393443\t-17.1\n0.019125683\t-11.6\n0.021857923\t-6.2\n0.024590164\t-6.4\n0.027322404\t-0.5\n0.030054645\t0.5\n0.032786885\t-2.4\n0.035519126\t-7.5\n0.038251366\t-16.8\n0.040983607\t-16.6\n0.043715847\t-14.6\n0.046448087\t-9.6\n0.049180328\t-5.8\n0.051912568\t-8.6\n0.054644809\t-9.0\n0.057377049\t-9.7\n0.06010929\t-6.9\n0.06284153\t-3.9\n0.06557377\t1.4\n0.068306011\t1.9\n0.071038251\t4.3\n0.073770492\t6.9\n0.076502732\t4.3\n0.079234973\t5.9\n0.081967213\t3.8\n0.084699454\t1.5\n0.087431694\t0.1\n0.090163934\t4.6\n0.092896175\t0.8\n0.095628415\t-0.5\n0.098360656\t-1.0\n0.101092896\t4.2\n0.103825137\t6.6\n0.106557377\t4.8\n0.109289617\t4.7\n0.112021858\t1.3\n0.114754098\t0.9\n0.117486339\t-2.8\n0.120218579\t-3.3\n0.12295082\t-5.3\n0.12568306\t-6.8\n0.128415301\t-5.1\n0.131147541\t-2.6\n0.133879781\t-0.5\n0.136612022\t-0.5\n0.139344262\t0.1\n0.142076503\t1.7\n0.144808743\t2.4\n0.147540984\t-0.9\n0.150273224\t-1.3\n0.153005464\t-1.4\n0.155737705\t-0.1\n0.158469945\t-0.7\n0.161202186\t-2.6\n0.163934426\t-4.1\n0.166666667\t-2.7\n0.169398907\t0.7\n0.172131148\t2.0\n0.174863388\t1.7\n0.177595628\t0.9\n0.180327869\t0.3\n0.183060109\t0.9\n0.18579235\t1.1\n0.18852459\t0.1\n0.191256831\t-0.9\n0.193989071\t0.2\n0.196721311\t0.1\n0.199453552\t1.0\n0.202185792\t3.4\n0.204918033\t5.2\n0.207650273\t4.9\n0.210382514\t4.9\n0.213114754\t2.2\n0.215846995\t2.9\n0.218579235\t5.3\n0.221311475\t3.7\n0.224043716\t3.4\n0.226775956\t2.1\n0.229508197\t1.8\n0.232240437\t4.3\n0.234972678\t7.0\n0.237704918\t7.7\n0.240437158\t6.2\n0.243169399\t7.5\n0.245901639\t4.9\n0.24863388\t4.4\n0.25136612\t3.8\n0.254098361\t6.4\n0.256830601\t8.0\n0.259562842\t7.9\n0.262295082\t8.9\n0.265027322\t6.6\n0.267759563\t6.5\n0.270491803\t5.8\n0.273224044\t5.6\n0.275956284\t4.7\n0.278688525\t5.5\n0.281420765\t5.5\n0.284153005\t5.8\n0.286885246\t5.3\n0.289617486\t6.9\n0.292349727\t5.9\n0.295081967
\t6.1\n0.297814208\t6.6\n0.300546448\t6.7\n0.303278689\t6.5\n0.306010929\t7.0\n0.308743169\t5.8\n0.31147541\t3.0\n0.31420765\t2.5\n0.316939891\t2.4\n0.319672131\t4.3\n0.322404372\t2.8\n0.325136612\t3.6\n0.327868852\t6.8\n0.330601093\t9.1\n0.333333333\t8.4\n0.336065574\t9.3\n0.338797814\t13.3\n0.341530055\t10.6\n0.344262295\t10.5\n0.346994536\t11.8\n0.349726776\t14.7\n0.352459016\t16.2\n0.355191257\t16.4\n0.357923497\t16.9\n0.360655738\t12.3\n0.363387978\t10.2\n0.366120219\t11.2\n0.368852459\t6.1\n0.371584699\t6.4\n0.37431694\t6.1\n0.37704918\t10.4\n0.379781421\t10.3\n0.382513661\t11.9\n0.385245902\t12.9\n0.387978142\t12.5\n0.390710383\t17.5\n0.393442623\t19.9\n0.396174863\t19.3\n0.398907104\t11.4\n0.401639344\t9.7\n0.404371585\t10.7\n0.407103825\t13.0\n0.409836066\t12.4\n0.412568306\t16.3\n0.415300546\t19.2\n0.418032787\t19.2\n0.420765027\t19.8\n0.423497268\t19.5\n0.426229508\t16.6\n0.428961749\t13.0\n0.431693989\t12.6\n0.43442623\t17.6\n0.43715847\t13.7\n0.43989071\t11.3\n0.442622951\t10.2\n0.445355191\t10.2\n0.448087432\t11.6\n0.450819672\t14.2\n0.453551913\t14.4\n0.456284153\t17.4\n0.459016393\t13.1\n0.461748634\t17.4\n0.464480874\t15.9\n0.467213115\t15.9\n0.469945355\t15.5\n0.472677596\t16.4\n0.475409836\t16.7\n0.478142077\t18.2\n0.480874317\t20.9\n0.483606557\t22.2\n0.486338798\t19.1\n0.489071038\t16.3\n0.491803279\t16.6\n0.494535519\t15.1\n0.49726776\t14.5\n0.5\t17.4\n0.50273224\t16.5\n0.505464481\t13.7\n0.508196721\t14.0\n0.510928962\t14.2\n0.513661202\t15.6\n0.516393443\t15.7\n0.519125683\t15.6\n0.521857923\t16.2\n0.524590164\t16.3\n0.527322404\t18.3\n0.530054645\t16.6\n0.532786885\t16.1\n0.535519126\t15.9\n0.538251366\t16.0\n0.540983607\t15.9\n0.543715847\t16.0\n0.546448087\t15.7\n0.549180328\t17.2\n0.551912568\t19.9\n0.554644809\t21.0\n0.557377049\t19.4\n0.56010929\t20.4\n0.56284153\t23.1\n0.56557377\t23.0\n0.568306011\t19.9\n0.571038251\t17.6\n0.573770492\t18.8\n0.576502732\t17.8\n0.579234973\t18.6\n0.581967213\t16.4\n0.584699454\t15.2\n0.587431694\t15.3\
n0.590163934\t16.0\n0.592896175\t18.0\n0.595628415\t17.7\n0.598360656\t16.0\n0.601092896\t16.4\n0.603825137\t16.7\n0.606557377\t14.3\n0.609289617\t12.2\n0.612021858\t10.0\n0.614754098\t12.0\n0.617486339\t16.2\n0.620218579\t15.9\n0.62295082\t14.5\n0.62568306\t15.3\n0.628415301\t13.3\n0.631147541\t14.5\n0.633879781\t15.5\n0.636612022\t15.3\n0.639344262\t17.3\n0.642076503\t15.3\n0.644808743\t16.4\n0.647540984\t17.0\n0.650273224\t20.2\n0.653005464\t22.4\n0.655737705\t18.1\n0.658469945\t11.6\n0.661202186\t14.6\n0.663934426\t13.5\n0.666666667\t17.9\n0.669398907\t16.4\n0.672131148\t15.5\n0.674863388\t15.9\n0.677595628\t14.1\n0.680327869\t13.2\n0.683060109\t14.5\n0.68579235\t19.0\n0.68852459\t18.3\n0.691256831\t18.8\n0.693989071\t16.8\n0.696721311\t16.8\n0.699453552\t14.3\n0.702185792\t18.4\n0.704918033\t18.3\n0.707650273\t18.4\n0.710382514\t14.9\n0.713114754\t11.4\n0.715846995\t12.6\n0.718579235\t14.0\n0.721311475\t14.8\n0.724043716\t9.9\n0.726775956\t11.4\n0.729508197\t12.9\n0.732240437\t12.1\n0.734972678\t12.8\n0.737704918\t13.5\n0.740437158\t12.9\n0.743169399\t14.0\n0.745901639\t14.6\n0.74863388\t12.0\n0.75136612\t10.5\n0.754098361\t9.5\n0.756830601\t7.6\n0.759562842\t6.4\n0.762295082\t7.0\n0.765027322\t8.1\n0.767759563\t8.1\n0.770491803\t7.6\n0.773224044\t7.4\n0.775956284\t7.2\n0.778688525\t7.0\n0.781420765\t6.4\n0.784153005\t5.8\n0.786885246\t5.5\n0.789617486\t6.4\n0.792349727\t7.3\n0.795081967\t7.4\n0.797814208\t7.8\n0.800546448\t7.9\n0.803278689\t6.9\n0.806010929\t6.1\n0.808743169\t3.7\n0.81147541\t5.3\n0.81420765\t6.1\n0.816939891\t4.3\n0.819672131\t3.3\n0.822404372\t8.8\n0.825136612\t9.8\n0.827868852\t6.4\n0.830601093\t4.6\n0.833333333\t5.2\n0.836065574\t5.5\n0.838797814\t1.4\n0.841530055\t0.5\n0.844262295\t-2.6\n0.846994536\t2.4\n0.849726776\t-0.8\n0.852459016\t-3.3\n0.855191257\t-2.8\n0.857923497\t-3.5\n0.860655738\t-2.8\n0.863387978\t-2.2\n0.866120219\t-0.3\n0.868852459\t0.0\n0.871584699\t2.3\n0.87431694\t4.9\n0.87704918\t3.1\n0.879781421\t3.6\n0.882513661\t5.2
\n0.885245902\t3.8\n0.887978142\t3.2\n0.890710383\t7.7\n0.893442623\t7.8\n0.896174863\t6.9\n0.898907104\t2.7\n0.901639344\t2.8\n0.904371585\t6.6\n0.907103825\t1.9\n0.909836066\t-1.4\n0.912568306\t2.2\n0.915300546\t1.9\n0.918032787\t-1.3\n0.920765027\t-1.6\n0.923497268\t-3.2\n0.926229508\t-2.7\n0.928961749\t3.7\n0.931693989\t-3.2\n0.93442623\t-0.2\n0.93715847\t9.3\n0.93989071\t7.1\n0.942622951\t3.2\n0.945355191\t1.1\n0.948087432\t-6.0\n0.950819672\t1.7\n0.953551913\t-1.3\n0.956284153\t-2.2\n0.959016393\t-1.2\n0.961748634\t1.0\n0.964480874\t1.7\n0.967213115\t3.7\n0.969945355\t4.7\n0.972677596\t-0.3\n0.975409836\t3.5\n0.978142077\t3.4\n0.980874317\t3.9\n0.983606557\t4.5\n0.986338798\t5.3\n0.989071038\t2.7\n0.991803279\t-0.4\n0.994535519\t4.3\n0.99726776\t7.0\n1\t9.3"
  },
  {
    "path": "mlfromscratch/deep_learning/__init__.py",
    "content": "from .neural_network import NeuralNetwork\n"
  },
  {
    "path": "mlfromscratch/deep_learning/activation_functions.py",
    "content": "import numpy as np\n\n# Collection of activation functions\n# Reference: https://en.wikipedia.org/wiki/Activation_function\n\nclass Sigmoid():\n    def __call__(self, x):\n        return 1 / (1 + np.exp(-x))\n\n    def gradient(self, x):\n        return self.__call__(x) * (1 - self.__call__(x))\n\nclass Softmax():\n    def __call__(self, x):\n        e_x = np.exp(x - np.max(x, axis=-1, keepdims=True))\n        return e_x / np.sum(e_x, axis=-1, keepdims=True)\n\n    def gradient(self, x):\n        p = self.__call__(x)\n        return p * (1 - p)\n\nclass TanH():\n    def __call__(self, x):\n        return 2 / (1 + np.exp(-2*x)) - 1\n\n    def gradient(self, x):\n        return 1 - np.power(self.__call__(x), 2)\n\nclass ReLU():\n    def __call__(self, x):\n        return np.where(x >= 0, x, 0)\n\n    def gradient(self, x):\n        return np.where(x >= 0, 1, 0)\n\nclass LeakyReLU():\n    def __init__(self, alpha=0.2):\n        self.alpha = alpha\n\n    def __call__(self, x):\n        return np.where(x >= 0, x, self.alpha * x)\n\n    def gradient(self, x):\n        return np.where(x >= 0, 1, self.alpha)\n\nclass ELU():\n    def __init__(self, alpha=0.1):\n        self.alpha = alpha \n\n    def __call__(self, x):\n        return np.where(x >= 0.0, x, self.alpha * (np.exp(x) - 1))\n\n    def gradient(self, x):\n        return np.where(x >= 0.0, 1, self.__call__(x) + self.alpha)\n\nclass SELU():\n    # Reference : https://arxiv.org/abs/1706.02515,\n    # https://github.com/bioinf-jku/SNNs/blob/master/SelfNormalizingNetworks_MLP_MNIST.ipynb\n    def __init__(self):\n        self.alpha = 1.6732632423543772848170429916717\n        self.scale = 1.0507009873554804934193349852946 \n\n    def __call__(self, x):\n        return self.scale * np.where(x >= 0.0, x, self.alpha*(np.exp(x)-1))\n\n    def gradient(self, x):\n        return self.scale * np.where(x >= 0.0, 1, self.alpha * np.exp(x))\n\nclass SoftPlus():\n    def __call__(self, x):\n        return np.log(1 
+ np.exp(x))\n\n    def gradient(self, x):\n        return 1 / (1 + np.exp(-x))\n\n"
  },
  {
    "path": "mlfromscratch/deep_learning/layers.py",
    "content": "\nfrom __future__ import print_function, division\nimport math\nimport numpy as np\nimport copy\nfrom mlfromscratch.deep_learning.activation_functions import Sigmoid, ReLU, SoftPlus, LeakyReLU\nfrom mlfromscratch.deep_learning.activation_functions import TanH, ELU, SELU, Softmax\n\n\nclass Layer(object):\n\n    def set_input_shape(self, shape):\n        \"\"\" Sets the shape that the layer expects of the input in the forward\n        pass method \"\"\"\n        self.input_shape = shape\n\n    def layer_name(self):\n        \"\"\" The name of the layer. Used in model summary. \"\"\"\n        return self.__class__.__name__\n\n    def parameters(self):\n        \"\"\" The number of trainable parameters used by the layer \"\"\"\n        return 0\n\n    def forward_pass(self, X, training):\n        \"\"\" Propogates the signal forward in the network \"\"\"\n        raise NotImplementedError()\n\n    def backward_pass(self, accum_grad):\n        \"\"\" Propogates the accumulated gradient backwards in the network.\n        If the has trainable weights then these weights are also tuned in this method.\n        As input (accum_grad) it receives the gradient with respect to the output of the layer and\n        returns the gradient with respect to the output of the previous layer. \"\"\"\n        raise NotImplementedError()\n\n    def output_shape(self):\n        \"\"\" The shape of the output produced by forward_pass \"\"\"\n        raise NotImplementedError()\n\n\nclass Dense(Layer):\n    \"\"\"A fully-connected NN layer.\n    Parameters:\n    -----------\n    n_units: int\n        The number of neurons in the layer.\n    input_shape: tuple\n        The expected input shape of the layer. For dense layers a single digit specifying\n        the number of features of the input. 
Must be specified if it is the first layer in\n        the network.\n    \"\"\"\n    def __init__(self, n_units, input_shape=None):\n        self.layer_input = None\n        self.input_shape = input_shape\n        self.n_units = n_units\n        self.trainable = True\n        self.W = None\n        self.w0 = None\n\n    def initialize(self, optimizer):\n        # Initialize the weights\n        limit = 1 / math.sqrt(self.input_shape[0])\n        self.W  = np.random.uniform(-limit, limit, (self.input_shape[0], self.n_units))\n        self.w0 = np.zeros((1, self.n_units))\n        # Weight optimizers\n        self.W_opt  = copy.copy(optimizer)\n        self.w0_opt = copy.copy(optimizer)\n\n    def parameters(self):\n        return np.prod(self.W.shape) + np.prod(self.w0.shape)\n\n    def forward_pass(self, X, training=True):\n        self.layer_input = X\n        return X.dot(self.W) + self.w0\n\n    def backward_pass(self, accum_grad):\n        # Save weights used during forwards pass\n        W = self.W\n\n        if self.trainable:\n            # Calculate gradient w.r.t layer weights\n            grad_w = self.layer_input.T.dot(accum_grad)\n            grad_w0 = np.sum(accum_grad, axis=0, keepdims=True)\n\n            # Update the layer weights\n            self.W = self.W_opt.update(self.W, grad_w)\n            self.w0 = self.w0_opt.update(self.w0, grad_w0)\n\n        # Return accumulated gradient for next layer\n        # Calculated based on the weights used during the forward pass\n        accum_grad = accum_grad.dot(W.T)\n        return accum_grad\n\n    def output_shape(self):\n        return (self.n_units, )\n\n\nclass RNN(Layer):\n    \"\"\"A Vanilla Fully-Connected Recurrent Neural Network layer.\n\n    Parameters:\n    -----------\n    n_units: int\n        The number of hidden states in the layer.\n    activation: string\n        The name of the activation function which will be applied to the output of each state.\n    bptt_trunc: int\n        Decides 
how many time steps the gradient should be propagated backwards through states\n        given the loss gradient for time step t.\n    input_shape: tuple\n        The expected input shape of the layer. For RNN layers a tuple (timesteps, input_dim),\n        as unpacked in initialize(). Must be specified if it is the first layer in\n        the network.\n\n    Reference:\n    http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-2-implementing-a-language-model-rnn-with-python-numpy-and-theano/\n    \"\"\"\n    def __init__(self, n_units, activation='tanh', bptt_trunc=5, input_shape=None):\n        self.input_shape = input_shape\n        self.n_units = n_units\n        self.activation = activation_functions[activation]()\n        self.trainable = True\n        self.bptt_trunc = bptt_trunc\n        self.W = None # Weight of the previous state\n        self.V = None # Weight of the output\n        self.U = None # Weight of the input\n\n    def initialize(self, optimizer):\n        timesteps, input_dim = self.input_shape\n        # Initialize the weights\n        limit = 1 / math.sqrt(input_dim)\n        self.U  = np.random.uniform(-limit, limit, (self.n_units, input_dim))\n        limit = 1 / math.sqrt(self.n_units)\n        self.V = np.random.uniform(-limit, limit, (input_dim, self.n_units))\n        self.W  = np.random.uniform(-limit, limit, (self.n_units, self.n_units))\n        # Weight optimizers\n        self.U_opt  = copy.copy(optimizer)\n        self.V_opt = copy.copy(optimizer)\n        self.W_opt = copy.copy(optimizer)\n\n    def parameters(self):\n        return np.prod(self.W.shape) + np.prod(self.U.shape) + np.prod(self.V.shape)\n\n    def forward_pass(self, X, training=True):\n        self.layer_input = X\n        batch_size, timesteps, input_dim = X.shape\n\n        # Save these values for use in backprop.\n        self.state_input = np.zeros((batch_size, timesteps, self.n_units))\n        self.states = np.zeros((batch_size, 
timesteps+1, self.n_units))\n        self.outputs = np.zeros((batch_size, timesteps, input_dim))\n\n        # Set last time step to zero for calculation of the state_input at time step zero\n        self.states[:, -1] = np.zeros((batch_size, self.n_units))\n        for t in range(timesteps):\n            # Input to state_t is the current input and output of previous states\n            self.state_input[:, t] = X[:, t].dot(self.U.T) + self.states[:, t-1].dot(self.W.T)\n            self.states[:, t] = self.activation(self.state_input[:, t])\n            self.outputs[:, t] = self.states[:, t].dot(self.V.T)\n\n        return self.outputs\n\n    def backward_pass(self, accum_grad):\n        _, timesteps, _ = accum_grad.shape\n\n        # Variables where we save the accumulated gradient w.r.t each parameter\n        grad_U = np.zeros_like(self.U)\n        grad_V = np.zeros_like(self.V)\n        grad_W = np.zeros_like(self.W)\n        # The gradient w.r.t the layer input.\n        # Will be passed on to the previous layer in the network\n        accum_grad_next = np.zeros_like(accum_grad)\n\n        # Back Propagation Through Time\n        for t in reversed(range(timesteps)):\n            # Update gradient w.r.t V at time step t\n            grad_V += accum_grad[:, t].T.dot(self.states[:, t])\n            # Calculate the gradient w.r.t the state input\n            grad_wrt_state = accum_grad[:, t].dot(self.V) * self.activation.gradient(self.state_input[:, t])\n            # Gradient w.r.t the layer input\n            accum_grad_next[:, t] = grad_wrt_state.dot(self.U)\n            # Update gradient w.r.t W and U by backprop. 
from time step t for at most\n            # self.bptt_trunc number of time steps\n            for t_ in reversed(np.arange(max(0, t - self.bptt_trunc), t+1)):\n                grad_U += grad_wrt_state.T.dot(self.layer_input[:, t_])\n                grad_W += grad_wrt_state.T.dot(self.states[:, t_-1])\n                # Calculate gradient w.r.t previous state\n                grad_wrt_state = grad_wrt_state.dot(self.W) * self.activation.gradient(self.state_input[:, t_-1])\n\n        # Update weights\n        self.U = self.U_opt.update(self.U, grad_U)\n        self.V = self.V_opt.update(self.V, grad_V)\n        self.W = self.W_opt.update(self.W, grad_W)\n\n        return accum_grad_next\n\n    def output_shape(self):\n        return self.input_shape\n\nclass Conv2D(Layer):\n    \"\"\"A 2D Convolution Layer.\n\n    Parameters:\n    -----------\n    n_filters: int\n        The number of filters that will convolve over the input matrix. The number of channels\n        of the output shape.\n    filter_shape: tuple\n        A tuple (filter_height, filter_width).\n    input_shape: tuple\n        The shape of the expected input of the layer. (batch_size, channels, height, width)\n        Only needs to be specified for first layer in the network.\n    padding: string\n        Either 'same' or 'valid'. 'same' results in padding being added so that the output height and width\n        matches the input height and width. 
For 'valid' no padding is added.\n    stride: int\n        The stride length of the filters during the convolution over the input.\n    \"\"\"\n    def __init__(self, n_filters, filter_shape, input_shape=None, padding='same', stride=1):\n        self.n_filters = n_filters\n        self.filter_shape = filter_shape\n        self.padding = padding\n        self.stride = stride\n        self.input_shape = input_shape\n        self.trainable = True\n\n    def initialize(self, optimizer):\n        # Initialize the weights\n        filter_height, filter_width = self.filter_shape\n        channels = self.input_shape[0]\n        limit = 1 / math.sqrt(np.prod(self.filter_shape))\n        self.W  = np.random.uniform(-limit, limit, size=(self.n_filters, channels, filter_height, filter_width))\n        self.w0 = np.zeros((self.n_filters, 1))\n        # Weight optimizers\n        self.W_opt  = copy.copy(optimizer)\n        self.w0_opt = copy.copy(optimizer)\n\n    def parameters(self):\n        return np.prod(self.W.shape) + np.prod(self.w0.shape)\n\n    def forward_pass(self, X, training=True):\n        batch_size, channels, height, width = X.shape\n        self.layer_input = X\n        # Turn image shape into column shape\n        # (enables dot product between input and weights)\n        self.X_col = image_to_column(X, self.filter_shape, stride=self.stride, output_shape=self.padding)\n        # Turn weights into column shape\n        self.W_col = self.W.reshape((self.n_filters, -1))\n        # Calculate output\n        output = self.W_col.dot(self.X_col) + self.w0\n        # Reshape into (n_filters, out_height, out_width, batch_size)\n        output = output.reshape(self.output_shape() + (batch_size, ))\n        # Redistribute axises so that batch size comes first\n        return output.transpose(3,0,1,2)\n\n    def backward_pass(self, accum_grad):\n        # Reshape accumulated gradient into column shape\n        accum_grad = accum_grad.transpose(1, 2, 3, 
0).reshape(self.n_filters, -1)\n\n        if self.trainable:\n            # Take dot product between column shaped accum. gradient and column shape\n            # layer input to determine the gradient at the layer with respect to layer weights\n            grad_w = accum_grad.dot(self.X_col.T).reshape(self.W.shape)\n            # The gradient with respect to bias terms is the sum similarly to in Dense layer\n            grad_w0 = np.sum(accum_grad, axis=1, keepdims=True)\n\n            # Update the layers weights\n            self.W = self.W_opt.update(self.W, grad_w)\n            self.w0 = self.w0_opt.update(self.w0, grad_w0)\n\n        # Recalculate the gradient which will be propogated back to prev. layer\n        accum_grad = self.W_col.T.dot(accum_grad)\n        # Reshape from column shape to image shape\n        accum_grad = column_to_image(accum_grad,\n                                self.layer_input.shape,\n                                self.filter_shape,\n                                stride=self.stride,\n                                output_shape=self.padding)\n\n        return accum_grad\n\n    def output_shape(self):\n        channels, height, width = self.input_shape\n        pad_h, pad_w = determine_padding(self.filter_shape, output_shape=self.padding)\n        output_height = (height + np.sum(pad_h) - self.filter_shape[0]) / self.stride + 1\n        output_width = (width + np.sum(pad_w) - self.filter_shape[1]) / self.stride + 1\n        return self.n_filters, int(output_height), int(output_width)\n\n\nclass BatchNormalization(Layer):\n    \"\"\"Batch normalization.\n    \"\"\"\n    def __init__(self, momentum=0.99):\n        self.momentum = momentum\n        self.trainable = True\n        self.eps = 0.01\n        self.running_mean = None\n        self.running_var = None\n\n    def initialize(self, optimizer):\n        # Initialize the parameters\n        self.gamma  = np.ones(self.input_shape)\n        self.beta = np.zeros(self.input_shape)\n   
     # parameter optimizers\n        self.gamma_opt  = copy.copy(optimizer)\n        self.beta_opt = copy.copy(optimizer)\n\n    def parameters(self):\n        return np.prod(self.gamma.shape) + np.prod(self.beta.shape)\n\n    def forward_pass(self, X, training=True):\n\n        # Initialize running mean and variance if first run\n        if self.running_mean is None:\n            self.running_mean = np.mean(X, axis=0)\n            self.running_var = np.var(X, axis=0)\n\n        if training and self.trainable:\n            mean = np.mean(X, axis=0)\n            var = np.var(X, axis=0)\n            self.running_mean = self.momentum * self.running_mean + (1 - self.momentum) * mean\n            self.running_var = self.momentum * self.running_var + (1 - self.momentum) * var\n        else:\n            mean = self.running_mean\n            var = self.running_var\n\n        # Statistics saved for backward pass\n        self.X_centered = X - mean\n        self.stddev_inv = 1 / np.sqrt(var + self.eps)\n\n        X_norm = self.X_centered * self.stddev_inv\n        output = self.gamma * X_norm + self.beta\n\n        return output\n\n    def backward_pass(self, accum_grad):\n\n        # Save parameters used during the forward pass\n        gamma = self.gamma\n\n        # If the layer is trainable the parameters are updated\n        if self.trainable:\n            X_norm = self.X_centered * self.stddev_inv\n            grad_gamma = np.sum(accum_grad * X_norm, axis=0)\n            grad_beta = np.sum(accum_grad, axis=0)\n\n            self.gamma = self.gamma_opt.update(self.gamma, grad_gamma)\n            self.beta = self.beta_opt.update(self.beta, grad_beta)\n\n        batch_size = accum_grad.shape[0]\n\n        # The gradient of the loss with respect to the layer inputs (use weights and statistics from forward pass)\n        accum_grad = (1 / batch_size) * gamma * self.stddev_inv * (\n            batch_size * accum_grad\n            - np.sum(accum_grad, axis=0)\n            - 
self.X_centered * self.stddev_inv**2 * np.sum(accum_grad * self.X_centered, axis=0)\n            )\n\n        return accum_grad\n\n    def output_shape(self):\n        return self.input_shape\n\n\nclass PoolingLayer(Layer):\n    \"\"\"A parent class of MaxPooling2D and AveragePooling2D\n    \"\"\"\n    def __init__(self, pool_shape=(2, 2), stride=1, padding=0):\n        self.pool_shape = pool_shape\n        self.stride = stride\n        self.padding = padding\n        self.trainable = True\n\n    def forward_pass(self, X, training=True):\n        self.layer_input = X\n\n        batch_size, channels, height, width = X.shape\n\n        _, out_height, out_width = self.output_shape()\n\n        X = X.reshape(batch_size*channels, 1, height, width)\n        X_col = image_to_column(X, self.pool_shape, self.stride, self.padding)\n\n        # MaxPool or AveragePool specific method\n        output = self._pool_forward(X_col)\n\n        output = output.reshape(out_height, out_width, batch_size, channels)\n        output = output.transpose(2, 3, 0, 1)\n\n        return output\n\n    def backward_pass(self, accum_grad):\n        batch_size, _, _, _ = accum_grad.shape\n        channels, height, width = self.input_shape\n        accum_grad = accum_grad.transpose(2, 3, 0, 1).ravel()\n\n        # MaxPool or AveragePool specific method\n        accum_grad_col = self._pool_backward(accum_grad)\n\n        accum_grad = column_to_image(accum_grad_col, (batch_size * channels, 1, height, width), self.pool_shape, self.stride, 0)\n        accum_grad = accum_grad.reshape((batch_size,) + self.input_shape)\n\n        return accum_grad\n\n    def output_shape(self):\n        channels, height, width = self.input_shape\n        out_height = (height - self.pool_shape[0]) / self.stride + 1\n        out_width = (width - self.pool_shape[1]) / self.stride + 1\n        assert out_height % 1 == 0\n        assert out_width % 1 == 0\n        return channels, int(out_height), int(out_width)\n\n\nclass 
MaxPooling2D(PoolingLayer):\n    def _pool_forward(self, X_col):\n        arg_max = np.argmax(X_col, axis=0).flatten()\n        output = X_col[arg_max, range(arg_max.size)]\n        self.cache = arg_max\n        return output\n\n    def _pool_backward(self, accum_grad):\n        accum_grad_col = np.zeros((np.prod(self.pool_shape), accum_grad.size))\n        arg_max = self.cache\n        accum_grad_col[arg_max, range(accum_grad.size)] = accum_grad\n        return accum_grad_col\n\nclass AveragePooling2D(PoolingLayer):\n    def _pool_forward(self, X_col):\n        output = np.mean(X_col, axis=0)\n        return output\n\n    def _pool_backward(self, accum_grad):\n        accum_grad_col = np.zeros((np.prod(self.pool_shape), accum_grad.size))\n        accum_grad_col[:, range(accum_grad.size)] = 1. / accum_grad_col.shape[0] * accum_grad\n        return accum_grad_col\n\n\nclass ConstantPadding2D(Layer):\n    \"\"\"Adds rows and columns of constant values to the input.\n    Expects the input to be of shape (batch_size, channels, height, width)\n\n    Parameters:\n    -----------\n    padding: tuple\n        The amount of padding along the height and width dimension of the input.\n        If (pad_h, pad_w) the same symmetric padding is applied along height and width dimension.\n        If ((pad_h0, pad_h1), (pad_w0, pad_w1)) the specified padding is added to beginning and end of\n        the height and width dimension.\n    padding_value: int or tuple\n        The value the is added as padding.\n    \"\"\"\n    def __init__(self, padding, padding_value=0):\n        self.padding = padding\n        self.trainable = True\n        if not isinstance(padding[0], tuple):\n            self.padding = ((padding[0], padding[0]), padding[1])\n        if not isinstance(padding[1], tuple):\n            self.padding = (self.padding[0], (padding[1], padding[1]))\n        self.padding_value = padding_value\n\n    def forward_pass(self, X, training=True):\n        output = np.pad(X,\n      
      pad_width=((0,0), (0,0), self.padding[0], self.padding[1]),\n            mode=\"constant\",\n            constant_values=self.padding_value)\n        return output\n\n    def backward_pass(self, accum_grad):\n        pad_top, pad_left = self.padding[0][0], self.padding[1][0]\n        height, width = self.input_shape[1], self.input_shape[2]\n        accum_grad = accum_grad[:, :, pad_top:pad_top+height, pad_left:pad_left+width]\n        return accum_grad\n\n    def output_shape(self):\n        new_height = self.input_shape[1] + np.sum(self.padding[0])\n        new_width = self.input_shape[2] + np.sum(self.padding[1])\n        return (self.input_shape[0], new_height, new_width)\n\n\nclass ZeroPadding2D(ConstantPadding2D):\n    \"\"\"Adds rows and columns of zero values to the input.\n    Expects the input to be of shape (batch_size, channels, height, width)\n\n    Parameters:\n    -----------\n    padding: tuple\n        The amount of padding along the height and width dimension of the input.\n        If (pad_h, pad_w) the same symmetric padding is applied along height and width dimension.\n        If ((pad_h0, pad_h1), (pad_w0, pad_w1)) the specified padding is added to beginning and end of\n        the height and width dimension.\n    \"\"\"\n    def __init__(self, padding):\n        self.padding = padding\n        if isinstance(padding[0], int):\n            self.padding = ((padding[0], padding[0]), padding[1])\n        if isinstance(padding[1], int):\n            self.padding = (self.padding[0], (padding[1], padding[1]))\n        self.padding_value = 0\n\n\nclass Flatten(Layer):\n    \"\"\" Turns a multidimensional matrix into two-dimensional \"\"\"\n    def __init__(self, input_shape=None):\n        self.prev_shape = None\n        self.trainable = True\n        self.input_shape = input_shape\n\n    def forward_pass(self, X, training=True):\n        self.prev_shape = X.shape\n        return X.reshape((X.shape[0], -1))\n\n    def backward_pass(self, 
accum_grad):\n        return accum_grad.reshape(self.prev_shape)\n\n    def output_shape(self):\n        return (np.prod(self.input_shape),)\n\n\nclass UpSampling2D(Layer):\n    \"\"\" Nearest neighbor up sampling of the input. Repeats the rows and\n    columns of the data by size[0] and size[1] respectively.\n\n    Parameters:\n    -----------\n    size: tuple\n        (size_y, size_x) - The number of times each axis will be repeated.\n    \"\"\"\n    def __init__(self, size=(2,2), input_shape=None):\n        self.prev_shape = None\n        self.trainable = True\n        self.size = size\n        self.input_shape = input_shape\n\n    def forward_pass(self, X, training=True):\n        self.prev_shape = X.shape\n        # Repeat each axis as specified by size\n        X_new = X.repeat(self.size[0], axis=2).repeat(self.size[1], axis=3)\n        return X_new\n\n    def backward_pass(self, accum_grad):\n        # Down sample input to previous shape\n        accum_grad = accum_grad[:, :, ::self.size[0], ::self.size[1]]\n        return accum_grad\n\n    def output_shape(self):\n        channels, height, width = self.input_shape\n        return channels, self.size[0] * height, self.size[1] * width\n\n\nclass Reshape(Layer):\n    \"\"\" Reshapes the input tensor into specified shape\n\n    Parameters:\n    -----------\n    shape: tuple\n        The shape which the input shall be reshaped to.\n    \"\"\"\n    def __init__(self, shape, input_shape=None):\n        self.prev_shape = None\n        self.trainable = True\n        self.shape = shape\n        self.input_shape = input_shape\n\n    def forward_pass(self, X, training=True):\n        self.prev_shape = X.shape\n        return X.reshape((X.shape[0], ) + self.shape)\n\n    def backward_pass(self, accum_grad):\n        return accum_grad.reshape(self.prev_shape)\n\n    def output_shape(self):\n        return self.shape\n\n\nclass Dropout(Layer):\n    \"\"\"A layer that randomly sets a fraction p of the output units of the 
previous layer\n    to zero.\n\n    Parameters:\n    -----------\n    p: float\n        The probability that unit x is set to zero.\n    \"\"\"\n    def __init__(self, p=0.2):\n        self.p = p\n        self._mask = None\n        self.input_shape = None\n        self.n_units = None\n        self.pass_through = True\n        self.trainable = True\n\n    def forward_pass(self, X, training=True):\n        c = (1 - self.p)\n        if training:\n            self._mask = np.random.uniform(size=X.shape) > self.p\n            c = self._mask\n        return X * c\n\n    def backward_pass(self, accum_grad):\n        return accum_grad * self._mask\n\n    def output_shape(self):\n        return self.input_shape\n\nactivation_functions = {\n    'relu': ReLU,\n    'sigmoid': Sigmoid,\n    'selu': SELU,\n    'elu': ELU,\n    'softmax': Softmax,\n    'leaky_relu': LeakyReLU,\n    'tanh': TanH,\n    'softplus': SoftPlus\n}\n\nclass Activation(Layer):\n    \"\"\"A layer that applies an activation operation to the input.\n\n    Parameters:\n    -----------\n    name: string\n        The name of the activation function that will be used.\n    \"\"\"\n\n    def __init__(self, name):\n        self.activation_name = name\n        self.activation_func = activation_functions[name]()\n        self.trainable = True\n\n    def layer_name(self):\n        return \"Activation (%s)\" % (self.activation_func.__class__.__name__)\n\n    def forward_pass(self, X, training=True):\n        self.layer_input = X\n        return self.activation_func(X)\n\n    def backward_pass(self, accum_grad):\n        return accum_grad * self.activation_func.gradient(self.layer_input)\n\n    def output_shape(self):\n        return self.input_shape\n\n\n# Method which calculates the padding based on the specified output shape and the\n# shape of the filters\ndef determine_padding(filter_shape, output_shape=\"same\"):\n\n    # No padding\n    if output_shape == \"valid\":\n        return (0, 0), (0, 0)\n    # Pad so 
that the output shape is the same as input shape (given that stride=1)\n    elif output_shape == \"same\":\n        filter_height, filter_width = filter_shape\n\n        # Derived from:\n        # output_height = (height + pad_h - filter_height) / stride + 1\n        # In this case output_height = height and stride = 1. This gives the\n        # expression for the padding below.\n        pad_h1 = int(math.floor((filter_height - 1)/2))\n        pad_h2 = int(math.ceil((filter_height - 1)/2))\n        pad_w1 = int(math.floor((filter_width - 1)/2))\n        pad_w2 = int(math.ceil((filter_width - 1)/2))\n\n        return (pad_h1, pad_h2), (pad_w1, pad_w2)\n\n\n# Reference: CS231n Stanford\ndef get_im2col_indices(images_shape, filter_shape, padding, stride=1):\n    # First figure out what the size of the output should be\n    batch_size, channels, height, width = images_shape\n    filter_height, filter_width = filter_shape\n    pad_h, pad_w = padding\n    out_height = int((height + np.sum(pad_h) - filter_height) / stride + 1)\n    out_width = int((width + np.sum(pad_w) - filter_width) / stride + 1)\n\n    i0 = np.repeat(np.arange(filter_height), filter_width)\n    i0 = np.tile(i0, channels)\n    i1 = stride * np.repeat(np.arange(out_height), out_width)\n    j0 = np.tile(np.arange(filter_width), filter_height * channels)\n    j1 = stride * np.tile(np.arange(out_width), out_height)\n    i = i0.reshape(-1, 1) + i1.reshape(1, -1)\n    j = j0.reshape(-1, 1) + j1.reshape(1, -1)\n\n    k = np.repeat(np.arange(channels), filter_height * filter_width).reshape(-1, 1)\n\n    return (k, i, j)\n\n\n# Method which turns the image shaped input to column shape.\n# Used during the forward pass.\n# Reference: CS231n Stanford\ndef image_to_column(images, filter_shape, stride, output_shape='same'):\n    filter_height, filter_width = filter_shape\n\n    pad_h, pad_w = determine_padding(filter_shape, output_shape)\n\n    # Add padding to the image\n    images_padded = np.pad(images, ((0, 0), 
(0, 0), pad_h, pad_w), mode='constant')\n\n    # Calculate the indices where the dot products are to be applied between weights\n    # and the image\n    k, i, j = get_im2col_indices(images.shape, filter_shape, (pad_h, pad_w), stride)\n\n    # Get content from image at those indices\n    cols = images_padded[:, k, i, j]\n    channels = images.shape[1]\n    # Reshape content into column shape\n    cols = cols.transpose(1, 2, 0).reshape(filter_height * filter_width * channels, -1)\n    return cols\n\n\n\n# Method which turns the column shaped input to image shape.\n# Used during the backward pass.\n# Reference: CS231n Stanford\ndef column_to_image(cols, images_shape, filter_shape, stride, output_shape='same'):\n    batch_size, channels, height, width = images_shape\n    pad_h, pad_w = determine_padding(filter_shape, output_shape)\n    height_padded = height + np.sum(pad_h)\n    width_padded = width + np.sum(pad_w)\n    images_padded = np.zeros((batch_size, channels, height_padded, width_padded))\n\n    # Calculate the indices where the dot products are applied between weights\n    # and the image\n    k, i, j = get_im2col_indices(images_shape, filter_shape, (pad_h, pad_w), stride)\n\n    cols = cols.reshape(channels * np.prod(filter_shape), -1, batch_size)\n    cols = cols.transpose(2, 0, 1)\n    # Add column content to the images at the indices\n    np.add.at(images_padded, (slice(None), k, i, j), cols)\n\n    # Return image without padding\n    return images_padded[:, :, pad_h[0]:height+pad_h[0], pad_w[0]:width+pad_w[0]]\n"
  },
  {
    "path": "mlfromscratch/deep_learning/loss_functions.py",
    "content": "from __future__ import division\nimport numpy as np\nfrom mlfromscratch.utils import accuracy_score\nfrom mlfromscratch.deep_learning.activation_functions import Sigmoid\n\nclass Loss(object):\n    def loss(self, y_true, y_pred):\n        # Raise (rather than return) so that subclasses must override this method\n        raise NotImplementedError()\n\n    def gradient(self, y, y_pred):\n        raise NotImplementedError()\n\n    def acc(self, y, y_pred):\n        return 0\n\nclass SquareLoss(Loss):\n    def __init__(self): pass\n\n    def loss(self, y, y_pred):\n        return 0.5 * np.power((y - y_pred), 2)\n\n    def gradient(self, y, y_pred):\n        return -(y - y_pred)\n\nclass CrossEntropy(Loss):\n    def __init__(self): pass\n\n    def loss(self, y, p):\n        # Clip to avoid taking the log of zero\n        p = np.clip(p, 1e-15, 1 - 1e-15)\n        return - y * np.log(p) - (1 - y) * np.log(1 - p)\n\n    def acc(self, y, p):\n        return accuracy_score(np.argmax(y, axis=1), np.argmax(p, axis=1))\n\n    def gradient(self, y, p):\n        # Avoid division by zero\n        p = np.clip(p, 1e-15, 1 - 1e-15)\n        return - (y / p) + (1 - y) / (1 - p)\n\n\n"
  },
  {
    "path": "mlfromscratch/deep_learning/neural_network.py",
    "content": "from __future__ import print_function, division\nfrom terminaltables import AsciiTable\nimport numpy as np\nimport progressbar\nfrom mlfromscratch.utils import batch_iterator\nfrom mlfromscratch.utils.misc import bar_widgets\n\n\nclass NeuralNetwork():\n    \"\"\"Neural Network. Deep Learning base model.\n\n    Parameters:\n    -----------\n    optimizer: object\n        An instance of the weight optimizer that will be used to tune the weights in order\n        to minimize the loss. Each layer receives its own copy.\n    loss: class\n        Loss function used to measure the model's performance. SquareLoss or CrossEntropy.\n    validation_data: tuple\n        A tuple containing validation data and labels (X, y)\n    \"\"\"\n    def __init__(self, optimizer, loss, validation_data=None):\n        self.optimizer = optimizer\n        self.layers = []\n        self.errors = {\"training\": [], \"validation\": []}\n        self.loss_function = loss()\n        self.progressbar = progressbar.ProgressBar(widgets=bar_widgets)\n\n        self.val_set = None\n        if validation_data:\n            X, y = validation_data\n            self.val_set = {\"X\": X, \"y\": y}\n\n    def set_trainable(self, trainable):\n        \"\"\" Method which enables freezing of the weights of the network's layers. 
\"\"\"\n        for layer in self.layers:\n            layer.trainable = trainable\n\n    def add(self, layer):\n        \"\"\" Method which adds a layer to the neural network \"\"\"\n        # If this is not the first layer added then set the input shape\n        # to the output shape of the last added layer\n        if self.layers:\n            layer.set_input_shape(shape=self.layers[-1].output_shape())\n\n        # If the layer has weights that need to be initialized\n        if hasattr(layer, 'initialize'):\n            layer.initialize(optimizer=self.optimizer)\n\n        # Add layer to the network\n        self.layers.append(layer)\n\n    def test_on_batch(self, X, y):\n        \"\"\" Evaluates the model over a single batch of samples \"\"\"\n        y_pred = self._forward_pass(X, training=False)\n        loss = np.mean(self.loss_function.loss(y, y_pred))\n        acc = self.loss_function.acc(y, y_pred)\n\n        return loss, acc\n\n    def train_on_batch(self, X, y):\n        \"\"\" Single gradient update over one batch of samples \"\"\"\n        y_pred = self._forward_pass(X)\n        loss = np.mean(self.loss_function.loss(y, y_pred))\n        acc = self.loss_function.acc(y, y_pred)\n        # Calculate the gradient of the loss function wrt y_pred\n        loss_grad = self.loss_function.gradient(y, y_pred)\n        # Backpropagate. 
Update weights\n        self._backward_pass(loss_grad=loss_grad)\n\n        return loss, acc\n\n    def fit(self, X, y, n_epochs, batch_size):\n        \"\"\" Trains the model for a fixed number of epochs \"\"\"\n        for _ in self.progressbar(range(n_epochs)):\n            \n            batch_error = []\n            for X_batch, y_batch in batch_iterator(X, y, batch_size=batch_size):\n                loss, _ = self.train_on_batch(X_batch, y_batch)\n                batch_error.append(loss)\n\n            self.errors[\"training\"].append(np.mean(batch_error))\n\n            if self.val_set is not None:\n                val_loss, _ = self.test_on_batch(self.val_set[\"X\"], self.val_set[\"y\"])\n                self.errors[\"validation\"].append(val_loss)\n\n        return self.errors[\"training\"], self.errors[\"validation\"]\n\n    def _forward_pass(self, X, training=True):\n        \"\"\" Calculate the output of the NN \"\"\"\n        layer_output = X\n        for layer in self.layers:\n            layer_output = layer.forward_pass(layer_output, training)\n\n        return layer_output\n\n    def _backward_pass(self, loss_grad):\n        \"\"\" Propagate the gradient 'backwards' and update the weights in each layer \"\"\"\n        for layer in reversed(self.layers):\n            loss_grad = layer.backward_pass(loss_grad)\n\n    def summary(self, name=\"Model Summary\"):\n        # Print model name\n        print (AsciiTable([[name]]).table)\n        # Network input shape (first layer's input shape)\n        print (\"Input Shape: %s\" % str(self.layers[0].input_shape))\n        # Iterate through network and get each layer's configuration\n        table_data = [[\"Layer Type\", \"Parameters\", \"Output Shape\"]]\n        tot_params = 0\n        for layer in self.layers:\n            layer_name = layer.layer_name()\n            params = layer.parameters()\n            out_shape = layer.output_shape()\n            table_data.append([layer_name, str(params), 
str(out_shape)])\n            tot_params += params\n        # Print network configuration table\n        print (AsciiTable(table_data).table)\n        print (\"Total Parameters: %d\\n\" % tot_params)\n\n    def predict(self, X):\n        \"\"\" Use the trained model to predict labels of X \"\"\"\n        return self._forward_pass(X, training=False)\n"
  },
  {
    "path": "mlfromscratch/deep_learning/optimizers.py",
    "content": "import numpy as np\nfrom mlfromscratch.utils import make_diagonal, normalize\n\n# Optimizers for models that use gradient based methods for finding the\n# weights that minimize the loss.\n# A great resource for understanding these methods:\n# http://sebastianruder.com/optimizing-gradient-descent/index.html\n\nclass StochasticGradientDescent():\n    def __init__(self, learning_rate=0.01, momentum=0):\n        self.learning_rate = learning_rate\n        self.momentum = momentum\n        self.w_updt = None\n\n    def update(self, w, grad_wrt_w):\n        # If not initialized\n        if self.w_updt is None:\n            self.w_updt = np.zeros(np.shape(w))\n        # Use momentum if set\n        self.w_updt = self.momentum * self.w_updt + (1 - self.momentum) * grad_wrt_w\n        # Move against the gradient to minimize loss\n        return w - self.learning_rate * self.w_updt\n\nclass NesterovAcceleratedGradient():\n    def __init__(self, learning_rate=0.001, momentum=0.4):\n        self.learning_rate = learning_rate\n        self.momentum = momentum\n        self.w_updt = None\n\n    def update(self, w, grad_func):\n        # Initialize on first update (must happen before w_updt is used below)\n        if self.w_updt is None:\n            self.w_updt = np.zeros(np.shape(w))\n        # Calculate the gradient of the loss a bit further down the slope from w\n        approx_future_grad = np.clip(grad_func(w - self.momentum * self.w_updt), -1, 1)\n\n        self.w_updt = self.momentum * self.w_updt + self.learning_rate * approx_future_grad\n        # Move against the gradient to minimize loss\n        return w - self.w_updt\n\nclass Adagrad():\n    def __init__(self, learning_rate=0.01):\n        self.learning_rate = learning_rate\n        self.G = None # Sum of squares of the gradients\n        self.eps = 1e-8\n\n    def update(self, w, grad_wrt_w):\n        # If not initialized\n        if self.G is None:\n            self.G = np.zeros(np.shape(w))\n        # Add the square of the 
gradient of the loss function at w\n        self.G += np.power(grad_wrt_w, 2)\n        # Adaptive gradient with higher learning rate for sparse data\n        return w - self.learning_rate * grad_wrt_w / np.sqrt(self.G + self.eps)\n\nclass Adadelta():\n    def __init__(self, rho=0.95, eps=1e-6):\n        self.E_w_updt = None # Running average of squared parameter updates\n        self.E_grad = None   # Running average of the squared gradient of w\n        self.w_updt = None   # Parameter update\n        self.eps = eps\n        self.rho = rho\n\n    def update(self, w, grad_wrt_w):\n        # If not initialized\n        if self.w_updt is None:\n            self.w_updt = np.zeros(np.shape(w))\n            self.E_w_updt = np.zeros(np.shape(w))\n            self.E_grad = np.zeros(np.shape(grad_wrt_w))\n\n        # Update average of gradients at w\n        self.E_grad = self.rho * self.E_grad + (1 - self.rho) * np.power(grad_wrt_w, 2)\n        \n        RMS_delta_w = np.sqrt(self.E_w_updt + self.eps)\n        RMS_grad = np.sqrt(self.E_grad + self.eps)\n\n        # Adaptive learning rate\n        adaptive_lr = RMS_delta_w / RMS_grad\n\n        # Calculate the update\n        self.w_updt = adaptive_lr * grad_wrt_w\n\n        # Update the running average of w updates\n        self.E_w_updt = self.rho * self.E_w_updt + (1 - self.rho) * np.power(self.w_updt, 2)\n\n        return w - self.w_updt\n\nclass RMSprop():\n    def __init__(self, learning_rate=0.01, rho=0.9):\n        self.learning_rate = learning_rate\n        self.Eg = None # Running average of the square gradients at w\n        self.eps = 1e-8\n        self.rho = rho\n\n    def update(self, w, grad_wrt_w):\n        # If not initialized\n        if self.Eg is None:\n            self.Eg = np.zeros(np.shape(grad_wrt_w))\n\n        self.Eg = self.rho * self.Eg + (1 - self.rho) * np.power(grad_wrt_w, 2)\n\n        # Divide the learning rate for a weight by a running average of the magnitudes of recent\n        # 
gradients for that weight\n        return w - self.learning_rate * grad_wrt_w / np.sqrt(self.Eg + self.eps)\n\nclass Adam():\n    def __init__(self, learning_rate=0.001, b1=0.9, b2=0.999):\n        self.learning_rate = learning_rate\n        self.eps = 1e-8\n        self.m = None\n        self.v = None\n        self.t = 0 # Number of updates performed so far (used for bias correction)\n        # Decay rates\n        self.b1 = b1\n        self.b2 = b2\n\n    def update(self, w, grad_wrt_w):\n        # If not initialized\n        if self.m is None:\n            self.m = np.zeros(np.shape(grad_wrt_w))\n            self.v = np.zeros(np.shape(grad_wrt_w))\n\n        self.t += 1\n        # Running averages of the gradient and of the squared gradient\n        self.m = self.b1 * self.m + (1 - self.b1) * grad_wrt_w\n        self.v = self.b2 * self.v + (1 - self.b2) * np.power(grad_wrt_w, 2)\n\n        # Bias correction: the averages start at zero, so rescale by 1 - b^t\n        m_hat = self.m / (1 - self.b1**self.t)\n        v_hat = self.v / (1 - self.b2**self.t)\n\n        self.w_updt = self.learning_rate * m_hat / (np.sqrt(v_hat) + self.eps)\n\n        return w - self.w_updt\n\n\n\n"
  },
  {
    "path": "mlfromscratch/examples/adaboost.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\nfrom sklearn import datasets\n\n# Import helper functions\nfrom mlfromscratch.supervised_learning import Adaboost\nfrom mlfromscratch.utils.data_manipulation import train_test_split\nfrom mlfromscratch.utils.data_operation import accuracy_score\nfrom mlfromscratch.utils import Plot\n\ndef main():\n    data = datasets.load_digits()\n    X = data.data\n    y = data.target\n\n    digit1 = 1\n    digit2 = 8\n    idx = np.append(np.where(y == digit1)[0], np.where(y == digit2)[0])\n    y = data.target[idx]\n    # Change labels to {-1, 1}\n    y[y == digit1] = -1\n    y[y == digit2] = 1\n    X = data.data[idx]\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)\n\n    # Adaboost classification with 5 weak classifiers\n    clf = Adaboost(n_clf=5)\n    clf.fit(X_train, y_train)\n    y_pred = clf.predict(X_test)\n\n    accuracy = accuracy_score(y_test, y_pred)\n    print (\"Accuracy:\", accuracy)\n\n    # Reduce dimensions to 2d using pca and plot the results\n    Plot().plot_in_2d(X_test, y_pred, title=\"Adaboost\", accuracy=accuracy)\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/apriori.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\n\nfrom mlfromscratch.unsupervised_learning import Apriori\n\ndef main():\n    # Demo transaction set\n    # Example 2: https://en.wikipedia.org/wiki/Apriori_algorithm\n    transactions = np.array([[1, 2, 3, 4], [1, 2, 4], [1, 2], [2, 3, 4], [2, 3], [3, 4], [2, 4]])\n    print (\"+-------------+\")\n    print (\"|   Apriori   |\")\n    print (\"+-------------+\")\n    min_sup = 0.25\n    min_conf = 0.8\n    print (\"Minimum Support: %.2f\" % (min_sup))\n    print (\"Minimum Confidence: %s\" % (min_conf))\n    print (\"Transactions:\")\n    for transaction in transactions:\n        print (\"\\t%s\" % transaction)\n\n    apriori = Apriori(min_sup=min_sup, min_conf=min_conf)\n\n    # Get and print the frequent itemsets\n    frequent_itemsets = apriori.find_frequent_itemsets(transactions)\n    print (\"Frequent Itemsets:\\n\\t%s\" % frequent_itemsets)\n\n    # Get and print the rules\n    rules = apriori.generate_rules(transactions)\n    print (\"Rules:\")\n    for rule in rules:\n        print (\"\\t%s -> %s (support: %.2f, confidence: %s)\" % (rule.antecedent, rule.concequent, rule.support, rule.confidence,))\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/bayesian_regression.py",
    "content": "import numpy as np\nimport pandas as pd\nimport matplotlib.pyplot as plt\n\n# Import helper functions\nfrom mlfromscratch.utils.data_operation import mean_squared_error\nfrom mlfromscratch.utils.data_manipulation import train_test_split, polynomial_features\nfrom mlfromscratch.supervised_learning import BayesianRegression\n\ndef main():\n\n    # Load temperature data\n    data = pd.read_csv('mlfromscratch/data/TempLinkoping2016.txt', sep=\"\\t\")\n\n    time = np.atleast_2d(data[\"time\"].values).T\n    temp = np.atleast_2d(data[\"temp\"].values).T\n\n    X = time # fraction of the year [0, 1]\n    y = temp\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)\n\n    n_samples, n_features = np.shape(X)\n\n    # Prior parameters\n    # - Weights are assumed distr. according to a Normal distribution\n    # - The variance of the weights are assumed distributed according to \n    #   a scaled inverse chi-squared distribution.\n    # High prior uncertainty!\n    # Normal\n    mu0 = np.array([0] * n_features)\n    omega0 = np.diag([.0001] * n_features)\n    # Scaled inverse chi-squared\n    nu0 = 1\n    sigma_sq0 = 100\n\n    # The credible interval\n    cred_int = 10\n\n    clf = BayesianRegression(n_draws=2000, \n        poly_degree=4, \n        mu0=mu0, \n        omega0=omega0, \n        nu0=nu0, \n        sigma_sq0=sigma_sq0,\n        cred_int=cred_int)\n    clf.fit(X_train, y_train)\n    y_pred = clf.predict(X_test)\n\n    mse = mean_squared_error(y_test, y_pred)\n\n    # Get prediction line\n    y_pred_, y_lower_, y_upper_ = clf.predict(X=X, eti=True)\n\n    # Print the mean squared error\n    print (\"Mean Squared Error:\", mse)\n\n    # Color map\n    cmap = plt.get_cmap('viridis')\n\n    # Plot the results\n    m1 = plt.scatter(366 * X_train, y_train, color=cmap(0.9), s=10)\n    m2 = plt.scatter(366 * X_test, y_test, color=cmap(0.5), s=10)\n    p1 = plt.plot(366 * X, y_pred_, color=\"black\", linewidth=2, 
label=\"Prediction\")\n    p2 = plt.plot(366 * X, y_lower_, color=\"gray\", linewidth=2, label=\"{0}% Credible Interval\".format(cred_int))\n    p3 = plt.plot(366 * X, y_upper_, color=\"gray\", linewidth=2)\n    plt.axis((0, 366, -20, 25))\n    plt.suptitle(\"Bayesian Regression\")\n    plt.title(\"MSE: %.2f\" % mse, fontsize=10)\n    plt.xlabel('Day')\n    plt.ylabel('Temperature in Celsius')\n    plt.legend(loc='lower right')\n\n    plt.show()\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/convolutional_neural_network.py",
    "content": "\nfrom __future__ import print_function\nfrom sklearn import datasets\nimport matplotlib.pyplot as plt\nimport math\nimport numpy as np\n\n# Import helper functions\nfrom mlfromscratch.deep_learning import NeuralNetwork\nfrom mlfromscratch.utils import train_test_split, to_categorical, normalize\nfrom mlfromscratch.utils import get_random_subsets, shuffle_data, Plot\nfrom mlfromscratch.utils.data_operation import accuracy_score\nfrom mlfromscratch.deep_learning.optimizers import StochasticGradientDescent, Adam, RMSprop, Adagrad, Adadelta\nfrom mlfromscratch.deep_learning.loss_functions import CrossEntropy\nfrom mlfromscratch.utils.misc import bar_widgets\nfrom mlfromscratch.deep_learning.layers import Dense, Dropout, Conv2D, Flatten, Activation, MaxPooling2D\nfrom mlfromscratch.deep_learning.layers import AveragePooling2D, ZeroPadding2D, BatchNormalization, RNN\n\n\n\ndef main():\n\n    #----------\n    # Conv Net\n    #----------\n\n    optimizer = Adam()\n\n    data = datasets.load_digits()\n    X = data.data\n    y = data.target\n\n    # Convert to one-hot encoding\n    y = to_categorical(y.astype(\"int\"))\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, seed=1)\n\n    # Reshape X to (n_samples, channels, height, width)\n    X_train = X_train.reshape((-1,1,8,8))\n    X_test = X_test.reshape((-1,1,8,8))\n\n    clf = NeuralNetwork(optimizer=optimizer,\n                        loss=CrossEntropy,\n                        validation_data=(X_test, y_test))\n\n    clf.add(Conv2D(n_filters=16, filter_shape=(3,3), stride=1, input_shape=(1,8,8), padding='same'))\n    clf.add(Activation('relu'))\n    clf.add(Dropout(0.25))\n    clf.add(BatchNormalization())\n    clf.add(Conv2D(n_filters=32, filter_shape=(3,3), stride=1, padding='same'))\n    clf.add(Activation('relu'))\n    clf.add(Dropout(0.25))\n    clf.add(BatchNormalization())\n    clf.add(Flatten())\n    clf.add(Dense(256))\n    clf.add(Activation('relu'))\n    
clf.add(Dropout(0.4))\n    clf.add(BatchNormalization())\n    clf.add(Dense(10))\n    clf.add(Activation('softmax'))\n\n    print ()\n    clf.summary(name=\"ConvNet\")\n\n    train_err, val_err = clf.fit(X_train, y_train, n_epochs=50, batch_size=256)\n\n    # Training and validation error plot\n    n = len(train_err)\n    training, = plt.plot(range(n), train_err, label=\"Training Error\")\n    validation, = plt.plot(range(n), val_err, label=\"Validation Error\")\n    plt.legend(handles=[training, validation])\n    plt.title(\"Error Plot\")\n    plt.ylabel('Error')\n    plt.xlabel('Iterations')\n    plt.show()\n\n    _, accuracy = clf.test_on_batch(X_test, y_test)\n    print (\"Accuracy:\", accuracy)\n\n\n    y_pred = np.argmax(clf.predict(X_test), axis=1)\n    X_test = X_test.reshape(-1, 8*8)\n    # Reduce dimension to 2D using PCA and plot the results\n    Plot().plot_in_2d(X_test, y_pred, title=\"Convolutional Neural Network\", accuracy=accuracy, legend_labels=range(10))\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "mlfromscratch/examples/dbscan.py",
    "content": "import sys\nimport os\nimport math\nimport random\nfrom sklearn import datasets\nimport numpy as np\n\n# Import helper functions\nfrom mlfromscratch.utils import Plot\nfrom mlfromscratch.unsupervised_learning import DBSCAN\n\ndef main():\n    # Load the dataset\n    X, y = datasets.make_moons(n_samples=300, noise=0.08, shuffle=False)\n\n    # Cluster the data using DBSCAN\n    clf = DBSCAN(eps=0.17, min_samples=5)\n    y_pred = clf.predict(X)\n\n    # Project the data onto the 2 primary principal components\n    p = Plot()\n    p.plot_in_2d(X, y_pred, title=\"DBSCAN\")\n    p.plot_in_2d(X, y, title=\"Actual Clustering\")\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "mlfromscratch/examples/decision_tree_classifier.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\nfrom sklearn import datasets\nimport matplotlib.pyplot as plt\nimport sys\nimport os\n\n# Import helper functions\nfrom mlfromscratch.utils import train_test_split, standardize, accuracy_score\nfrom mlfromscratch.utils import mean_squared_error, calculate_variance, Plot\nfrom mlfromscratch.supervised_learning import ClassificationTree\n\ndef main():\n\n    print (\"-- Classification Tree --\")\n\n    data = datasets.load_iris()\n    X = data.data\n    y = data.target\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)\n\n    clf = ClassificationTree()\n    clf.fit(X_train, y_train)\n    y_pred = clf.predict(X_test)\n\n    accuracy = accuracy_score(y_test, y_pred)\n\n    print (\"Accuracy:\", accuracy)\n\n    Plot().plot_in_2d(X_test, y_pred, \n        title=\"Decision Tree\", \n        accuracy=accuracy, \n        legend_labels=data.target_names)\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/decision_tree_regressor.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport pandas as pd\n\nfrom mlfromscratch.utils import train_test_split, standardize, accuracy_score\nfrom mlfromscratch.utils import mean_squared_error, calculate_variance, Plot\nfrom mlfromscratch.supervised_learning import RegressionTree\n\ndef main():\n\n    print (\"-- Regression Tree --\")\n\n    # Load temperature data\n    data = pd.read_csv('mlfromscratch/data/TempLinkoping2016.txt', sep=\"\\t\")\n\n    time = np.atleast_2d(data[\"time\"].values).T\n    temp = np.atleast_2d(data[\"temp\"].values).T\n\n    X = standardize(time)        # Time. Fraction of the year [0, 1]\n    y = temp[:, 0]  # Temperature. Reduce to one-dim\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)\n\n    model = RegressionTree()\n    model.fit(X_train, y_train)\n    y_pred = model.predict(X_test)\n\n    y_pred_line = model.predict(X)\n\n    # Color map\n    cmap = plt.get_cmap('viridis')\n\n    mse = mean_squared_error(y_test, y_pred)\n\n    print (\"Mean Squared Error:\", mse)\n\n    # Plot the results\n    # Plot the results\n    m1 = plt.scatter(366 * X_train, y_train, color=cmap(0.9), s=10)\n    m2 = plt.scatter(366 * X_test, y_test, color=cmap(0.5), s=10)\n    m3 = plt.scatter(366 * X_test, y_pred, color='black', s=10)\n    plt.suptitle(\"Regression Tree\")\n    plt.title(\"MSE: %.2f\" % mse, fontsize=10)\n    plt.xlabel('Day')\n    plt.ylabel('Temperature in Celcius')\n    plt.legend((m1, m2, m3), (\"Training data\", \"Test data\", \"Prediction\"), loc='lower right')\n    plt.show()\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/deep_q_network.py",
    "content": "from __future__ import print_function\nimport numpy as np\nfrom mlfromscratch.utils import to_categorical\nfrom mlfromscratch.deep_learning.optimizers import Adam\nfrom mlfromscratch.deep_learning.loss_functions import SquareLoss\nfrom mlfromscratch.deep_learning.layers import Dense, Dropout, Flatten, Activation, Reshape, BatchNormalization\nfrom mlfromscratch.deep_learning import NeuralNetwork\nfrom mlfromscratch.reinforcement_learning import DeepQNetwork\n\n\ndef main():\n    dqn = DeepQNetwork(env_name='CartPole-v1',\n                        epsilon=0.9, \n                        gamma=0.8, \n                        decay_rate=0.005, \n                        min_epsilon=0.1)\n\n    # Model builder\n    def model(n_inputs, n_outputs):    \n        clf = NeuralNetwork(optimizer=Adam(), loss=SquareLoss)\n        clf.add(Dense(64, input_shape=(n_inputs,)))\n        clf.add(Activation('relu'))\n        clf.add(Dense(n_outputs))\n        return clf\n\n    dqn.set_model(model)\n\n    print ()\n    dqn.model.summary(name=\"Deep Q-Network\")\n\n    dqn.train(n_epochs=500)\n    dqn.play(n_epochs=100)\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/demo.py",
    "content": "from __future__ import print_function\nfrom sklearn import datasets\nimport numpy as np\nimport math\nimport matplotlib.pyplot as plt\n\nfrom mlfromscratch.utils import train_test_split, normalize, to_categorical, accuracy_score\nfrom mlfromscratch.deep_learning.optimizers import Adam\nfrom mlfromscratch.deep_learning.loss_functions import CrossEntropy\nfrom mlfromscratch.deep_learning.activation_functions import Softmax\nfrom mlfromscratch.utils.kernels import *\nfrom mlfromscratch.supervised_learning import *\nfrom mlfromscratch.deep_learning import *\nfrom mlfromscratch.unsupervised_learning import PCA\nfrom mlfromscratch.deep_learning.layers import Dense, Dropout, Conv2D, Flatten, Activation\n\n\nprint (\"+-------------------------------------------+\")\nprint (\"|                                           |\")\nprint (\"|       Machine Learning From Scratch       |\")\nprint (\"|                                           |\")\nprint (\"+-------------------------------------------+\")\n\n\n# ...........\n#  LOAD DATA\n# ...........\ndata = datasets.load_digits()\ndigit1 = 1\ndigit2 = 8\nidx = np.append(np.where(data.target == digit1)[0], np.where(data.target == digit2)[0])\ny = data.target[idx]\n# Change labels to {0, 1}\ny[y == digit1] = 0\ny[y == digit2] = 1\nX = data.data[idx]\nX = normalize(X)\n\nprint (\"Dataset: The Digit Dataset (digits %s and %s)\" % (digit1, digit2))\n\n# ..........................\n#  DIMENSIONALITY REDUCTION\n# ..........................\npca = PCA()\nX = pca.transform(X, n_components=5) # Reduce to 5 dimensions\n\nn_samples, n_features = np.shape(X)\n\n# ..........................\n#  TRAIN / TEST SPLIT\n# ..........................\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)\n# Rescaled labels {-1, 1}\nrescaled_y_train = 2*y_train - np.ones(np.shape(y_train))\nrescaled_y_test = 2*y_test - np.ones(np.shape(y_test))\n\n# .......\n#  SETUP\n# .......\nadaboost = Adaboost(n_clf = 
8)\nnaive_bayes = NaiveBayes()\nknn = KNN(k=4)\nlogistic_regression = LogisticRegression()\nmlp = NeuralNetwork(optimizer=Adam(), \n                    loss=CrossEntropy)\nmlp.add(Dense(input_shape=(n_features,), n_units=64))\nmlp.add(Activation('relu'))\nmlp.add(Dense(n_units=64))\nmlp.add(Activation('relu'))\nmlp.add(Dense(n_units=2))   \nmlp.add(Activation('softmax'))\nperceptron = Perceptron()\ndecision_tree = ClassificationTree()\nrandom_forest = RandomForest(n_estimators=50)\nsupport_vector_machine = SupportVectorMachine()\nlda = LDA()\ngbc = GradientBoostingClassifier(n_estimators=50, learning_rate=.9, max_depth=2)\nxgboost = XGBoost(n_estimators=50, learning_rate=0.5)\n\n# ........\n#  TRAIN\n# ........\nprint (\"Training:\")\nprint (\"- Adaboost\")\nadaboost.fit(X_train, rescaled_y_train)\nprint (\"- Decision Tree\")\ndecision_tree.fit(X_train, y_train)\nprint (\"- Gradient Boosting\")\ngbc.fit(X_train, y_train)\nprint (\"- LDA\")\nlda.fit(X_train, y_train)\nprint (\"- Logistic Regression\")\nlogistic_regression.fit(X_train, y_train)\nprint (\"- Multilayer Perceptron\")\nmlp.fit(X_train, to_categorical(y_train), n_epochs=300, batch_size=50)\nprint (\"- Naive Bayes\")\nnaive_bayes.fit(X_train, y_train)\nprint (\"- Perceptron\")\nperceptron.fit(X_train, to_categorical(y_train))\nprint (\"- Random Forest\")\nrandom_forest.fit(X_train, y_train)\nprint (\"- Support Vector Machine\")\nsupport_vector_machine.fit(X_train, rescaled_y_train)\nprint (\"- XGBoost\")\nxgboost.fit(X_train, y_train)\n\n\n\n# .........\n#  PREDICT\n# .........\ny_pred = {}\ny_pred[\"Adaboost\"] = adaboost.predict(X_test)\ny_pred[\"Gradient Boosting\"] = gbc.predict(X_test)\ny_pred[\"Naive Bayes\"] = naive_bayes.predict(X_test)\ny_pred[\"K Nearest Neighbors\"] = knn.predict(X_test, X_train, y_train)\ny_pred[\"Logistic Regression\"] = logistic_regression.predict(X_test)\ny_pred[\"LDA\"] = lda.predict(X_test)\ny_pred[\"Multilayer Perceptron\"] = np.argmax(mlp.predict(X_test), 
axis=1)\ny_pred[\"Perceptron\"] = np.argmax(perceptron.predict(X_test), axis=1)\ny_pred[\"Decision Tree\"] = decision_tree.predict(X_test)\ny_pred[\"Random Forest\"] = random_forest.predict(X_test)\ny_pred[\"Support Vector Machine\"] = support_vector_machine.predict(X_test)\ny_pred[\"XGBoost\"] = xgboost.predict(X_test)\n\n# ..........\n#  ACCURACY\n# ..........\nprint (\"Accuracy:\")\nfor clf in y_pred:\n    # Rescaled {-1 1}\n    if clf == \"Adaboost\" or clf == \"Support Vector Machine\":\n        print (\"\\t%-23s: %.5f\" %(clf, accuracy_score(rescaled_y_test, y_pred[clf])))\n    # Categorical\n    else:\n        print (\"\\t%-23s: %.5f\" %(clf, accuracy_score(y_test, y_pred[clf])))\n\n# .......\n#  PLOT\n# .......\nplt.scatter(X_test[:,0], X_test[:,1], c=y_test)\nplt.ylabel(\"Principal Component 2\")\nplt.xlabel(\"Principal Component 1\")\nplt.title(\"The Digit Dataset (digits %s and %s)\" % (digit1, digit2))\nplt.show()\n\n\n"
  },
  {
    "path": "mlfromscratch/examples/elastic_net.py",
    "content": "from __future__ import print_function\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport pandas as pd\n# Import helper functions\nfrom mlfromscratch.supervised_learning import ElasticNet\nfrom mlfromscratch.utils import k_fold_cross_validation_sets, normalize, mean_squared_error\nfrom mlfromscratch.utils import train_test_split, polynomial_features, Plot\n\n\ndef main():\n\n    # Load temperature data\n    data = pd.read_csv('mlfromscratch/data/TempLinkoping2016.txt', sep=\"\\t\")\n\n    time = np.atleast_2d(data[\"time\"].values).T\n    temp = data[\"temp\"].values\n\n    X = time # fraction of the year [0, 1]\n    y = temp\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)\n\n    poly_degree = 13\n\n    model = ElasticNet(degree=15, \n                        reg_factor=0.01,\n                        l1_ratio=0.7,\n                        learning_rate=0.001,\n                        n_iterations=4000)\n    model.fit(X_train, y_train)\n\n    # Training error plot\n    n = len(model.training_errors)\n    training, = plt.plot(range(n), model.training_errors, label=\"Training Error\")\n    plt.legend(handles=[training])\n    plt.title(\"Error Plot\")\n    plt.ylabel('Mean Squared Error')\n    plt.xlabel('Iterations')\n    plt.show()\n\n    y_pred = model.predict(X_test)\n    mse = mean_squared_error(y_test, y_pred)\n    print (\"Mean squared error: %s (given by reg. 
factor: %s)\" % (mse, 0.01))\n\n    y_pred_line = model.predict(X)\n\n    # Color map\n    cmap = plt.get_cmap('viridis')\n\n    # Plot the results\n    m1 = plt.scatter(366 * X_train, y_train, color=cmap(0.9), s=10)\n    m2 = plt.scatter(366 * X_test, y_test, color=cmap(0.5), s=10)\n    plt.plot(366 * X, y_pred_line, color='black', linewidth=2, label=\"Prediction\")\n    plt.suptitle(\"Elastic Net\")\n    plt.title(\"MSE: %.2f\" % mse, fontsize=10)\n    plt.xlabel('Day')\n    plt.ylabel('Temperature in Celsius')\n    plt.legend((m1, m2), (\"Training data\", \"Test data\"), loc='lower right')\n    plt.show()\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "mlfromscratch/examples/fp_growth.py",
    "content": "\nimport numpy as np\nfrom mlfromscratch.unsupervised_learning import FPGrowth\n\ndef main():\n    # Demo transaction set\n    # Example:\n    # https://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Frequent_Pattern_Mining/The_FP-Growth_Algorithm\n    \n    transactions = np.array([\n        [\"A\", \"B\", \"D\", \"E\"],\n        [\"B\", \"C\", \"E\"],\n        [\"A\", \"B\", \"D\", \"E\"],\n        [\"A\", \"B\", \"C\", \"E\"],\n        [\"A\", \"B\", \"C\", \"D\", \"E\"],\n        [\"B\", \"C\", \"D\"]\n    ])\n\n    print (\"\")\n    print (\"+---------------+\")\n    print (\"|   FP-Growth   |\")\n    print (\"+---------------+\")\n    min_sup = 3\n    print (\"Minimum Support: %s\" % min_sup)\n    print (\"\")\n    print (\"Transactions:\")\n    for transaction in transactions:\n        print (\"\\t%s\" % transaction)\n\n    fp_growth = FPGrowth(min_sup=min_sup)\n\n    print (\"\")\n    # Get and print the frequent itemsets\n    frequent_itemsets = fp_growth.find_frequent_itemsets(\n        transactions, show_tree=True)\n\n    print (\"\")\n    print (\"Frequent itemsets:\")\n    for itemset in frequent_itemsets:\n        print (\"\\t%s\" % itemset)\n    print (\"\")\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/gaussian_mixture_model.py",
    "content": "from __future__ import division, print_function\nimport sys\nimport os\nimport math\nimport random\nfrom sklearn import datasets\nimport numpy as np\n\nfrom mlfromscratch.unsupervised_learning import GaussianMixtureModel\nfrom mlfromscratch.utils import Plot\n\n\ndef main():\n    # Load the dataset\n    X, y = datasets.make_blobs()\n\n    # Cluster the data\n    clf = GaussianMixtureModel(k=3)\n    y_pred = clf.predict(X)\n\n    p = Plot()\n    p.plot_in_2d(X, y_pred, title=\"GMM Clustering\")\n    p.plot_in_2d(X, y, title=\"Actual Clustering\")\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/genetic_algorithm.py",
    "content": "\nfrom mlfromscratch.unsupervised_learning import GeneticAlgorithm\n\ndef main():\n    target_string = \"Genetic Algorithm\"\n    population_size = 100\n    mutation_rate = 0.05\n    genetic_algorithm = GeneticAlgorithm(target_string,\n                                        population_size,\n                                        mutation_rate)\n\n    print (\"\")\n    print (\"+--------+\")\n    print (\"|   GA   |\")\n    print (\"+--------+\")\n    print (\"Description: Implementation of a Genetic Algorithm which aims to produce\")\n    print (\"the user specified target string. This implementation calculates each\")\n    print (\"candidate's fitness based on the alphabetical distance between the candidate\")\n    print (\"and the target. A candidate is selected as a parent with probabilities proportional\")\n    print (\"to the candidate's fitness. Reproduction is implemented as a single-point\")\n    print (\"crossover between pairs of parents. Mutation is done by randomly assigning\")\n    print (\"new characters with uniform probability.\")\n    print (\"\")\n    print (\"Parameters\")\n    print (\"----------\")\n    print (\"Target String: '%s'\" % target_string)\n    print (\"Population Size: %d\" % population_size)\n    print (\"Mutation Rate: %s\" % mutation_rate)\n    print (\"\")\n\n    genetic_algorithm.run(iterations=1000)\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/gradient_boosting_classifier.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\nfrom sklearn import datasets\nimport matplotlib.pyplot as plt\n\n# Import helper functions\nfrom mlfromscratch.utils import train_test_split, accuracy_score\nfrom mlfromscratch.deep_learning.loss_functions import CrossEntropy\nfrom mlfromscratch.utils import Plot\nfrom mlfromscratch.supervised_learning import GradientBoostingClassifier\n\ndef main():\n\n    print (\"-- Gradient Boosting Classification --\")\n\n    data = datasets.load_iris()\n    X = data.data\n    y = data.target\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)\n\n    clf = GradientBoostingClassifier()\n    clf.fit(X_train, y_train)\n    y_pred = clf.predict(X_test)\n\n    accuracy = accuracy_score(y_test, y_pred)\n\n    print (\"Accuracy:\", accuracy)\n\n\n    Plot().plot_in_2d(X_test, y_pred, \n        title=\"Gradient Boosting\", \n        accuracy=accuracy, \n        legend_labels=data.target_names)\n\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/gradient_boosting_regressor.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\nimport pandas as pd\nimport matplotlib.pyplot as plt\nimport progressbar\n\nfrom mlfromscratch.utils import train_test_split, standardize, to_categorical\nfrom mlfromscratch.utils import mean_squared_error, accuracy_score, Plot\nfrom mlfromscratch.utils.loss_functions import SquareLoss\nfrom mlfromscratch.utils.misc import bar_widgets\nfrom mlfromscratch.supervised_learning import GradientBoostingRegressor\n\n\ndef main():\n    print (\"-- Gradient Boosting Regression --\")\n\n    # Load temperature data\n    data = pd.read_csv('mlfromscratch/data/TempLinkoping2016.txt', sep=\"\\t\")\n\n    time = np.atleast_2d(data[\"time\"].values).T\n    temp = np.atleast_2d(data[\"temp\"].values).T\n\n    X = time.reshape((-1, 1))               # Time. Fraction of the year [0, 1]\n    X = np.insert(X, 0, values=1, axis=1)   # Insert bias term\n    y = temp[:, 0]                          # Temperature. Reduce to one-dim\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)\n\n    model = GradientBoostingRegressor()\n    model.fit(X_train, y_train)\n    y_pred = model.predict(X_test)\n\n    y_pred_line = model.predict(X)\n\n    # Color map\n    cmap = plt.get_cmap('viridis')\n\n    mse = mean_squared_error(y_test, y_pred)\n\n    print (\"Mean Squared Error:\", mse)\n\n    # Plot the results\n    m1 = plt.scatter(366 * X_train[:, 1], y_train, color=cmap(0.9), s=10)\n    m2 = plt.scatter(366 * X_test[:, 1], y_test, color=cmap(0.5), s=10)\n    m3 = plt.scatter(366 * X_test[:, 1], y_pred, color='black', s=10)\n    plt.suptitle(\"Regression Tree\")\n    plt.title(\"MSE: %.2f\" % mse, fontsize=10)\n    plt.xlabel('Day')\n    plt.ylabel('Temperature in Celcius')\n    plt.legend((m1, m2, m3), (\"Training data\", \"Test data\", \"Prediction\"), loc='lower right')\n    plt.show()\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/k_means.py",
    "content": "from __future__ import division, print_function\nfrom sklearn import datasets\nimport numpy as np\n\nfrom mlfromscratch.unsupervised_learning import KMeans\nfrom mlfromscratch.utils import Plot\n\n\ndef main():\n    # Load the dataset\n    X, y = datasets.make_blobs()\n\n    # Cluster the data using K-Means\n    clf = KMeans(k=3)\n    y_pred = clf.predict(X)\n\n    # Project the data onto the 2 primary principal components\n    p = Plot()\n    p.plot_in_2d(X, y_pred, title=\"K-Means Clustering\")\n    p.plot_in_2d(X, y, title=\"Actual Clustering\")\n\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/k_nearest_neighbors.py",
    "content": "from __future__ import print_function\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom sklearn import datasets\n\nfrom mlfromscratch.utils import train_test_split, normalize, accuracy_score\nfrom mlfromscratch.utils import euclidean_distance, Plot\nfrom mlfromscratch.supervised_learning import KNN\n\ndef main():\n    data = datasets.load_iris()\n    X = normalize(data.data)\n    y = data.target\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)\n\n    clf = KNN(k=5)\n    y_pred = clf.predict(X_test, X_train, y_train)\n    \n    accuracy = accuracy_score(y_test, y_pred)\n\n    print (\"Accuracy:\", accuracy)\n\n    # Reduce dimensions to 2d using pca and plot the results\n    Plot().plot_in_2d(X_test, y_pred, title=\"K Nearest Neighbors\", accuracy=accuracy, legend_labels=data.target_names)\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/lasso_regression.py",
    "content": "from __future__ import print_function\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport pandas as pd\n# Import helper functions\nfrom mlfromscratch.supervised_learning import LassoRegression\nfrom mlfromscratch.utils import k_fold_cross_validation_sets, normalize, mean_squared_error\nfrom mlfromscratch.utils import train_test_split, polynomial_features, Plot\n\n\ndef main():\n\n    # Load temperature data\n    data = pd.read_csv('mlfromscratch/data/TempLinkoping2016.txt', sep=\"\\t\")\n\n    time = np.atleast_2d(data[\"time\"].values).T\n    temp = data[\"temp\"].values\n\n    X = time # fraction of the year [0, 1]\n    y = temp\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)\n\n    poly_degree = 13\n\n    model = LassoRegression(degree=15, \n                            reg_factor=0.05,\n                            learning_rate=0.001,\n                            n_iterations=4000)\n    model.fit(X_train, y_train)\n\n    # Training error plot\n    n = len(model.training_errors)\n    training, = plt.plot(range(n), model.training_errors, label=\"Training Error\")\n    plt.legend(handles=[training])\n    plt.title(\"Error Plot\")\n    plt.ylabel('Mean Squared Error')\n    plt.xlabel('Iterations')\n    plt.show()\n\n    y_pred = model.predict(X_test)\n    mse = mean_squared_error(y_test, y_pred)\n    print (\"Mean squared error: %s (given by reg. 
factor: %s)\" % (mse, 0.05))\n\n    y_pred_line = model.predict(X)\n\n    # Color map\n    cmap = plt.get_cmap('viridis')\n\n    # Plot the results\n    m1 = plt.scatter(366 * X_train, y_train, color=cmap(0.9), s=10)\n    m2 = plt.scatter(366 * X_test, y_test, color=cmap(0.5), s=10)\n    plt.plot(366 * X, y_pred_line, color='black', linewidth=2, label=\"Prediction\")\n    plt.suptitle(\"Lasso Regression\")\n    plt.title(\"MSE: %.2f\" % mse, fontsize=10)\n    plt.xlabel('Day')\n    plt.ylabel('Temperature in Celsius')\n    plt.legend((m1, m2), (\"Training data\", \"Test data\"), loc='lower right')\n    plt.show()\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "mlfromscratch/examples/linear_discriminant_analysis.py",
    "content": "from __future__ import print_function\nfrom sklearn import datasets\nimport matplotlib.pyplot as plt\nimport numpy as np\n\nfrom mlfromscratch.supervised_learning import LDA\nfrom mlfromscratch.utils import calculate_covariance_matrix, accuracy_score\nfrom mlfromscratch.utils import normalize, standardize, train_test_split, Plot\nfrom mlfromscratch.unsupervised_learning import PCA\n\ndef main():\n    # Load the dataset\n    data = datasets.load_iris()\n    X = data.data\n    y = data.target\n\n    # Three -> two classes\n    X = X[y != 2]\n    y = y[y != 2]\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)\n\n    # Fit and predict using LDA\n    lda = LDA()\n    lda.fit(X_train, y_train)\n    y_pred = lda.predict(X_test)\n\n    accuracy = accuracy_score(y_test, y_pred)\n\n    print (\"Accuracy:\", accuracy)\n\n    Plot().plot_in_2d(X_test, y_pred, title=\"LDA\", accuracy=accuracy)\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "mlfromscratch/examples/linear_regression.py",
    "content": "import numpy as np\nimport pandas as pd\nimport matplotlib.pyplot as plt\nfrom sklearn.datasets import make_regression\n\nfrom mlfromscratch.utils import train_test_split, polynomial_features\nfrom mlfromscratch.utils import mean_squared_error, Plot\nfrom mlfromscratch.supervised_learning import LinearRegression\n\ndef main():\n\n    X, y = make_regression(n_samples=100, n_features=1, noise=20)\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)\n\n    n_samples, n_features = np.shape(X)\n\n    model = LinearRegression(n_iterations=100)\n\n    model.fit(X_train, y_train)\n    \n    # Training error plot\n    n = len(model.training_errors)\n    training, = plt.plot(range(n), model.training_errors, label=\"Training Error\")\n    plt.legend(handles=[training])\n    plt.title(\"Error Plot\")\n    plt.ylabel('Mean Squared Error')\n    plt.xlabel('Iterations')\n    plt.show()\n\n    y_pred = model.predict(X_test)\n    mse = mean_squared_error(y_test, y_pred)\n    print (\"Mean squared error: %s\" % (mse))\n\n    y_pred_line = model.predict(X)\n\n    # Color map\n    cmap = plt.get_cmap('viridis')\n\n    # Plot the results\n    m1 = plt.scatter(366 * X_train, y_train, color=cmap(0.9), s=10)\n    m2 = plt.scatter(366 * X_test, y_test, color=cmap(0.5), s=10)\n    plt.plot(366 * X, y_pred_line, color='black', linewidth=2, label=\"Prediction\")\n    plt.suptitle(\"Linear Regression\")\n    plt.title(\"MSE: %.2f\" % mse, fontsize=10)\n    plt.xlabel('Day')\n    plt.ylabel('Temperature in Celcius')\n    plt.legend((m1, m2), (\"Training data\", \"Test data\"), loc='lower right')\n    plt.show()\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/logistic_regression.py",
    "content": "from __future__ import print_function\nfrom sklearn import datasets\nimport numpy as np\nimport matplotlib.pyplot as plt\n\n# Import helper functions\nfrom mlfromscratch.utils import make_diagonal, normalize, train_test_split, accuracy_score\nfrom mlfromscratch.deep_learning.activation_functions import Sigmoid\nfrom mlfromscratch.utils import Plot\nfrom mlfromscratch.supervised_learning import LogisticRegression\n\ndef main():\n    # Load dataset\n    data = datasets.load_iris()\n    X = normalize(data.data[data.target != 0])\n    y = data.target[data.target != 0]\n    y[y == 1] = 0\n    y[y == 2] = 1\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, seed=1)\n\n    clf = LogisticRegression(gradient_descent=True)\n    clf.fit(X_train, y_train)\n    y_pred = clf.predict(X_test)\n\n    accuracy = accuracy_score(y_test, y_pred)\n    print (\"Accuracy:\", accuracy)\n\n    # Reduce dimension to two using PCA and plot the results\n    Plot().plot_in_2d(X_test, y_pred, title=\"Logistic Regression\", accuracy=accuracy)\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/multi_class_lda.py",
    "content": "from __future__ import print_function\nfrom sklearn import datasets\nimport numpy as np\n\nfrom mlfromscratch.supervised_learning import MultiClassLDA\nfrom mlfromscratch.utils import normalize\n\ndef main():\n    # Load the dataset\n    data = datasets.load_iris()\n    X = normalize(data.data)\n    y = data.target\n\n    # Project the data onto the 2 primary components\n    multi_class_lda = MultiClassLDA()\n    multi_class_lda.plot_in_2d(X, y, title=\"LDA\")\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/multilayer_perceptron.py",
    "content": "\nfrom __future__ import print_function\nfrom sklearn import datasets\nimport matplotlib.pyplot as plt\nimport numpy as np\n\n# Import helper functions\nfrom mlfromscratch.deep_learning import NeuralNetwork\nfrom mlfromscratch.utils import train_test_split, to_categorical, normalize, Plot\nfrom mlfromscratch.utils import get_random_subsets, shuffle_data, accuracy_score\nfrom mlfromscratch.deep_learning.optimizers import StochasticGradientDescent, Adam, RMSprop, Adagrad, Adadelta\nfrom mlfromscratch.deep_learning.loss_functions import CrossEntropy\nfrom mlfromscratch.utils.misc import bar_widgets\nfrom mlfromscratch.deep_learning.layers import Dense, Dropout, Activation\n\n\ndef main():\n\n    optimizer = Adam()\n\n    #-----\n    # MLP\n    #-----\n\n    data = datasets.load_digits()\n    X = data.data\n    y = data.target\n\n    # Convert to one-hot encoding\n    y = to_categorical(y.astype(\"int\"))\n\n    n_samples, n_features = X.shape\n    n_hidden = 512\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, seed=1)\n\n    clf = NeuralNetwork(optimizer=optimizer,\n                        loss=CrossEntropy,\n                        validation_data=(X_test, y_test))\n\n    clf.add(Dense(n_hidden, input_shape=(n_features,)))\n    clf.add(Activation('leaky_relu'))\n    clf.add(Dense(n_hidden))\n    clf.add(Activation('leaky_relu'))\n    clf.add(Dropout(0.25))\n    clf.add(Dense(n_hidden))\n    clf.add(Activation('leaky_relu'))\n    clf.add(Dropout(0.25))\n    clf.add(Dense(n_hidden))\n    clf.add(Activation('leaky_relu'))\n    clf.add(Dropout(0.25))\n    clf.add(Dense(10))\n    clf.add(Activation('softmax'))\n\n    print ()\n    clf.summary(name=\"MLP\")\n    \n    train_err, val_err = clf.fit(X_train, y_train, n_epochs=50, batch_size=256)\n    \n    # Training and validation error plot\n    n = len(train_err)\n    training, = plt.plot(range(n), train_err, label=\"Training Error\")\n    validation, = plt.plot(range(n), 
val_err, label=\"Validation Error\")\n    plt.legend(handles=[training, validation])\n    plt.title(\"Error Plot\")\n    plt.ylabel('Error')\n    plt.xlabel('Iterations')\n    plt.show()\n\n    _, accuracy = clf.test_on_batch(X_test, y_test)\n    print (\"Accuracy:\", accuracy)\n\n    # Reduce dimension to 2D using PCA and plot the results\n    y_pred = np.argmax(clf.predict(X_test), axis=1)\n    Plot().plot_in_2d(X_test, y_pred, title=\"Multilayer Perceptron\", accuracy=accuracy, legend_labels=range(10))\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/naive_bayes.py",
    "content": "from __future__ import division, print_function\nfrom sklearn import datasets\nimport numpy as np\nfrom mlfromscratch.utils import train_test_split, normalize, accuracy_score, Plot\nfrom mlfromscratch.supervised_learning import NaiveBayes\n\ndef main():\n    data = datasets.load_digits()\n    X = normalize(data.data)\n    y = data.target\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)\n\n    clf = NaiveBayes()\n    clf.fit(X_train, y_train)\n    y_pred = clf.predict(X_test)\n\n    accuracy = accuracy_score(y_test, y_pred)\n\n    print (\"Accuracy:\", accuracy)\n\n    # Reduce dimension to two using PCA and plot the results\n    Plot().plot_in_2d(X_test, y_pred, title=\"Naive Bayes\", accuracy=accuracy, legend_labels=data.target_names)\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/neuroevolution.py",
    "content": "\nfrom __future__ import print_function\nfrom sklearn import datasets\nimport matplotlib.pyplot as plt\nimport numpy as np\n\nfrom mlfromscratch.supervised_learning import Neuroevolution\nfrom mlfromscratch.utils import train_test_split, to_categorical, normalize, Plot\nfrom mlfromscratch.deep_learning import NeuralNetwork\nfrom mlfromscratch.deep_learning.layers import Activation, Dense\nfrom mlfromscratch.deep_learning.loss_functions import CrossEntropy\nfrom mlfromscratch.deep_learning.optimizers import Adam\n\ndef main():\n\n    X, y = datasets.make_classification(n_samples=1000, n_features=10, n_classes=4, n_clusters_per_class=1, n_informative=2)\n\n    data = datasets.load_digits()\n    X = normalize(data.data)\n    y = data.target\n    y = to_categorical(y.astype(\"int\"))\n\n    # Model builder\n    def model_builder(n_inputs, n_outputs):    \n        model = NeuralNetwork(optimizer=Adam(), loss=CrossEntropy)\n        model.add(Dense(16, input_shape=(n_inputs,)))\n        model.add(Activation('relu'))\n        model.add(Dense(n_outputs))\n        model.add(Activation('softmax'))\n\n        return model\n\n    # Print the model summary of a individual in the population\n    print (\"\")\n    model_builder(n_inputs=X.shape[1], n_outputs=y.shape[1]).summary()\n\n    population_size = 100\n    n_generations = 3000\n    mutation_rate = 0.01\n\n    print (\"Population Size: %d\" % population_size)\n    print (\"Generations: %d\" % n_generations)\n    print (\"Mutation Rate: %.2f\" % mutation_rate)\n    print (\"\")\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, seed=1)\n\n    model = Neuroevolution(population_size=population_size, \n                        mutation_rate=mutation_rate, \n                        model_builder=model_builder)\n    \n    model = model.evolve(X_train, y_train, n_generations=n_generations)\n\n    loss, accuracy = model.test_on_batch(X_test, y_test)\n\n    # Reduce dimension to 2D using 
PCA and plot the results\n    y_pred = np.argmax(model.predict(X_test), axis=1)\n    Plot().plot_in_2d(X_test, y_pred, title=\"Evolutionary Evolved Neural Network\", accuracy=accuracy, legend_labels=range(y.shape[1]))\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/particle_swarm_optimization.py",
    "content": "\nfrom __future__ import print_function\nfrom sklearn import datasets\nimport matplotlib.pyplot as plt\nimport numpy as np\n\nfrom mlfromscratch.supervised_learning import ParticleSwarmOptimizedNN\nfrom mlfromscratch.utils import train_test_split, to_categorical, normalize, Plot\nfrom mlfromscratch.deep_learning import NeuralNetwork\nfrom mlfromscratch.deep_learning.layers import Activation, Dense\nfrom mlfromscratch.deep_learning.loss_functions import CrossEntropy\nfrom mlfromscratch.deep_learning.optimizers import Adam\n\ndef main():\n\n    X, y = datasets.make_classification(n_samples=1000, n_features=10, n_classes=4, n_clusters_per_class=1, n_informative=2)\n\n    data = datasets.load_iris()\n    X = normalize(data.data)\n    y = data.target\n    y = to_categorical(y.astype(\"int\"))\n\n    # Model builder\n    def model_builder(n_inputs, n_outputs):    \n        model = NeuralNetwork(optimizer=Adam(), loss=CrossEntropy)\n        model.add(Dense(16, input_shape=(n_inputs,)))\n        model.add(Activation('relu'))\n        model.add(Dense(n_outputs))\n        model.add(Activation('softmax'))\n\n        return model\n\n    # Print the model summary of a individual in the population\n    print (\"\")\n    model_builder(n_inputs=X.shape[1], n_outputs=y.shape[1]).summary()\n\n    population_size = 100\n    n_generations = 10\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, seed=1)\n\n    inertia_weight = 0.8\n    cognitive_weight = 0.8\n    social_weight = 0.8\n\n    print (\"Population Size: %d\" % population_size)\n    print (\"Generations: %d\" % n_generations)\n    print (\"\")\n    print (\"Inertia Weight: %.2f\" % inertia_weight)\n    print (\"Cognitive Weight: %.2f\" % cognitive_weight)\n    print (\"Social Weight: %.2f\" % social_weight)\n    print (\"\")\n\n    model = ParticleSwarmOptimizedNN(population_size=population_size, \n                        inertia_weight=inertia_weight,\n                        
cognitive_weight=cognitive_weight,\n                        social_weight=social_weight,\n                        max_velocity=5,\n                        model_builder=model_builder)\n    \n    model = model.evolve(X_train, y_train, n_generations=n_generations)\n\n    loss, accuracy = model.test_on_batch(X_test, y_test)\n\n    print (\"Accuracy: %.1f%%\" % float(100*accuracy))\n\n    # Reduce dimension to 2D using PCA and plot the results\n    y_pred = np.argmax(model.predict(X_test), axis=1)\n    Plot().plot_in_2d(X_test, y_pred, title=\"Particle Swarm Optimized Neural Network\", accuracy=accuracy, legend_labels=range(y.shape[1]))\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/partitioning_around_medoids.py",
    "content": "from sklearn import datasets\nimport numpy as np\n\n# Import helper functions\nfrom mlfromscratch.utils import Plot\nfrom mlfromscratch.unsupervised_learning import PAM\n\ndef main():\n    # Load the dataset\n    X, y = datasets.make_blobs()\n\n    # Cluster the data using K-Medoids\n    clf = PAM(k=3)\n    y_pred = clf.predict(X)\n\n    # Project the data onto the 2 primary principal components\n    p = Plot()\n    p.plot_in_2d(X, y_pred, title=\"PAM Clustering\")\n    p.plot_in_2d(X, y, title=\"Actual Clustering\")\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/perceptron.py",
    "content": "from __future__ import print_function\nfrom sklearn import datasets\nimport numpy as np\n\n# Import helper functions\nfrom mlfromscratch.utils import train_test_split, normalize, to_categorical, accuracy_score\nfrom mlfromscratch.deep_learning.activation_functions import Sigmoid\nfrom mlfromscratch.deep_learning.loss_functions import CrossEntropy \nfrom mlfromscratch.utils import Plot\nfrom mlfromscratch.supervised_learning import Perceptron\n\n\ndef main():\n    data = datasets.load_digits()\n    X = normalize(data.data)\n    y = data.target\n\n    # One-hot encoding of nominal y-values\n    y = to_categorical(y)\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, seed=1)\n\n    # Perceptron\n    clf = Perceptron(n_iterations=5000,\n        learning_rate=0.001, \n        loss=CrossEntropy,\n        activation_function=Sigmoid)\n    clf.fit(X_train, y_train)\n\n    y_pred = np.argmax(clf.predict(X_test), axis=1)\n    y_test = np.argmax(y_test, axis=1)\n\n    accuracy = accuracy_score(y_test, y_pred)\n\n    print (\"Accuracy:\", accuracy)\n\n    # Reduce dimension to two using PCA and plot the results\n    # (y is one-hot encoded at this point, so take the class labels from the dataset)\n    Plot().plot_in_2d(X_test, y_pred, title=\"Perceptron\", accuracy=accuracy, legend_labels=data.target_names)\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/polynomial_regression.py",
    "content": "from __future__ import print_function\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport pandas as pd\n# Import helper functions\nfrom mlfromscratch.supervised_learning import PolynomialRidgeRegression\nfrom mlfromscratch.utils import k_fold_cross_validation_sets, normalize, mean_squared_error\nfrom mlfromscratch.utils import train_test_split, polynomial_features, Plot\n\n\ndef main():\n\n    # Load temperature data\n    data = pd.read_csv('mlfromscratch/data/TempLinkoping2016.txt', sep=\"\\t\")\n\n    time = np.atleast_2d(data[\"time\"].values).T\n    temp = data[\"temp\"].values\n\n    X = time # fraction of the year [0, 1]\n    y = temp\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)\n\n    poly_degree = 15\n\n    # Finding regularization constant using cross validation\n    lowest_error = float(\"inf\")\n    best_reg_factor = None\n    print (\"Finding regularization constant using cross validation:\")\n    k = 10\n    for reg_factor in np.arange(0, 0.1, 0.01):\n        cross_validation_sets = k_fold_cross_validation_sets(\n            X_train, y_train, k=k)\n        mse = 0\n        for _X_train, _X_test, _y_train, _y_test in cross_validation_sets:\n            model = PolynomialRidgeRegression(degree=poly_degree, \n                                            reg_factor=reg_factor,\n                                            learning_rate=0.001,\n                                            n_iterations=10000)\n            model.fit(_X_train, _y_train)\n            y_pred = model.predict(_X_test)\n            _mse = mean_squared_error(_y_test, y_pred)\n            mse += _mse\n        mse /= k\n\n        # Print the mean squared error\n        print (\"\\tMean Squared Error: %s (regularization: %s)\" % (mse, reg_factor))\n\n        # Save reg. 
constant that gave lowest error\n        if mse < lowest_error:\n            best_reg_factor = reg_factor\n            lowest_error = mse\n\n    # Make final prediction\n    model = PolynomialRidgeRegression(degree=poly_degree, \n                                    reg_factor=best_reg_factor,\n                                    learning_rate=0.001,\n                                    n_iterations=10000)\n    model.fit(X_train, y_train)\n    y_pred = model.predict(X_test)\n    mse = mean_squared_error(y_test, y_pred)\n    print (\"Mean squared error: %s (given by reg. factor: %s)\" % (lowest_error, best_reg_factor))\n\n    y_pred_line = model.predict(X)\n\n    # Color map\n    cmap = plt.get_cmap('viridis')\n\n    # Plot the results\n    m1 = plt.scatter(366 * X_train, y_train, color=cmap(0.9), s=10)\n    m2 = plt.scatter(366 * X_test, y_test, color=cmap(0.5), s=10)\n    plt.plot(366 * X, y_pred_line, color='black', linewidth=2, label=\"Prediction\")\n    plt.suptitle(\"Polynomial Ridge Regression\")\n    plt.title(\"MSE: %.2f\" % mse, fontsize=10)\n    plt.xlabel('Day')\n    plt.ylabel('Temperature in Celsius')\n    plt.legend((m1, m2), (\"Training data\", \"Test data\"), loc='lower right')\n    plt.show()\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "mlfromscratch/examples/principal_component_analysis.py",
    "content": "from sklearn import datasets\nimport matplotlib.pyplot as plt\nimport matplotlib.cm as cmx\nimport matplotlib.colors as colors\nimport numpy as np\nfrom mlfromscratch.unsupervised_learning import PCA\n\ndef main():\n\n    # Demo of how to reduce the dimensionality of the data to two dimensions\n    # and plot the results. \n\n    # Load the dataset\n    data = datasets.load_digits()\n    X = data.data\n    y = data.target\n\n    # Project the data onto the 2 primary principal components\n    X_trans = PCA().transform(X, 2)\n\n    x1 = X_trans[:, 0]\n    x2 = X_trans[:, 1]\n\n    cmap = plt.get_cmap('viridis')\n    colors = [cmap(i) for i in np.linspace(0, 1, len(np.unique(y)))]\n\n    class_distr = []\n    # Plot the different class distributions\n    for i, l in enumerate(np.unique(y)):\n        _x1 = x1[y == l]\n        _x2 = x2[y == l]\n        _y = y[y == l]\n        class_distr.append(plt.scatter(_x1, _x2, color=colors[i]))\n\n    # Add a legend (one label per class, in the same order as the scatter handles)\n    plt.legend(class_distr, np.unique(y), loc=1)\n\n    # Axis labels\n    plt.suptitle(\"PCA Dimensionality Reduction\")\n    plt.title(\"Digit Dataset\")\n    plt.xlabel('Principal Component 1')\n    plt.ylabel('Principal Component 2')\n    plt.show()\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/random_forest.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\nfrom sklearn import datasets\nfrom mlfromscratch.utils import train_test_split, accuracy_score, Plot\nfrom mlfromscratch.supervised_learning import RandomForest\n\ndef main():\n    data = datasets.load_digits()\n    X = data.data\n    y = data.target\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, seed=2)\n\n    clf = RandomForest(n_estimators=100)\n    clf.fit(X_train, y_train)\n    y_pred = clf.predict(X_test)\n\n    accuracy = accuracy_score(y_test, y_pred)\n\n    print (\"Accuracy:\", accuracy)\n\n    Plot().plot_in_2d(X_test, y_pred, title=\"Random Forest\", accuracy=accuracy, legend_labels=data.target_names)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "mlfromscratch/examples/recurrent_neural_network.py",
    "content": "from __future__ import print_function\nimport matplotlib.pyplot as plt\nimport numpy as np\n\nfrom mlfromscratch.deep_learning import NeuralNetwork\nfrom mlfromscratch.utils import train_test_split, to_categorical, normalize, Plot\nfrom mlfromscratch.utils import get_random_subsets, shuffle_data, accuracy_score\nfrom mlfromscratch.deep_learning.optimizers import StochasticGradientDescent, Adam, RMSprop, Adagrad, Adadelta\nfrom mlfromscratch.deep_learning.loss_functions import CrossEntropy\nfrom mlfromscratch.utils.misc import bar_widgets\nfrom mlfromscratch.deep_learning.layers import RNN, Activation\n\n\ndef main():\n\n    optimizer = Adam()\n\n    def gen_mult_ser(nums):\n        \"\"\" Method which generates multiplication series \"\"\"\n        X = np.zeros([nums, 10, 61], dtype=float)\n        y = np.zeros([nums, 10, 61], dtype=float)\n        for i in range(nums):\n            start = np.random.randint(2, 7)\n            mult_ser = np.linspace(start, start*10, num=10, dtype=int)\n            X[i] = to_categorical(mult_ser, n_col=61)\n            y[i] = np.roll(X[i], -1, axis=0)\n        y[:, -1, 1] = 1 # Mark endpoint as 1\n        return X, y\n\n\n    def gen_num_seq(nums):\n        \"\"\" Method which generates sequence of numbers \"\"\"\n        X = np.zeros([nums, 10, 20], dtype=float)\n        y = np.zeros([nums, 10, 20], dtype=float)\n        for i in range(nums):\n            start = np.random.randint(0, 10)\n            num_seq = np.arange(start, start+10)\n            X[i] = to_categorical(num_seq, n_col=20)\n            y[i] = np.roll(X[i], -1, axis=0)\n        y[:, -1, 1] = 1 # Mark endpoint as 1\n        return X, y\n\n    X, y = gen_mult_ser(3000)\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)\n\n    # Model definition\n    clf = NeuralNetwork(optimizer=optimizer,\n                        loss=CrossEntropy)\n    clf.add(RNN(10, activation=\"tanh\", bptt_trunc=5, input_shape=(10, 61)))\n    
clf.add(Activation('softmax'))\n    clf.summary(\"RNN\")\n\n    # Print a problem instance and the correct solution\n    tmp_X = np.argmax(X_train[0], axis=1)\n    tmp_y = np.argmax(y_train[0], axis=1)\n    print (\"Number Series Problem:\")\n    print (\"X = [\" + \" \".join(tmp_X.astype(\"str\")) + \"]\")\n    print (\"y = [\" + \" \".join(tmp_y.astype(\"str\")) + \"]\")\n    print ()\n\n    train_err, _ = clf.fit(X_train, y_train, n_epochs=500, batch_size=512)\n\n    # Predict labels of the test data\n    y_pred = np.argmax(clf.predict(X_test), axis=2)\n    y_test = np.argmax(y_test, axis=2)\n\n    print ()\n    print (\"Results:\")\n    for i in range(5):\n        # Print a problem instance and the correct solution\n        tmp_X = np.argmax(X_test[i], axis=1)\n        tmp_y1 = y_test[i]\n        tmp_y2 = y_pred[i]\n        print (\"X      = [\" + \" \".join(tmp_X.astype(\"str\")) + \"]\")\n        print (\"y_true = [\" + \" \".join(tmp_y1.astype(\"str\")) + \"]\")\n        print (\"y_pred = [\" + \" \".join(tmp_y2.astype(\"str\")) + \"]\")\n        print ()\n    \n    accuracy = np.mean(accuracy_score(y_test, y_pred))\n    print (\"Accuracy:\", accuracy)\n\n    training = plt.plot(range(500), train_err, label=\"Training Error\")\n    plt.title(\"Error Plot\")\n    plt.ylabel('Training Error')\n    plt.xlabel('Iterations')\n    plt.show()\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/restricted_boltzmann_machine.py",
    "content": "import logging\n\nimport numpy as np\nfrom sklearn import datasets\nfrom sklearn.datasets import fetch_openml\nimport matplotlib.pyplot as plt\n\nfrom mlfromscratch.unsupervised_learning import RBM\n\nlogging.basicConfig(level=logging.DEBUG)\n\ndef main():\n\n    # fetch_mldata has been removed from recent scikit-learn releases;\n    # fetch_openml is its replacement\n    mnist = fetch_openml('mnist_784', version=1, as_frame=False)\n\n    X = mnist.data / 255.0\n    # OpenML stores the labels as strings\n    y = mnist.target.astype(int)\n\n    # Select the samples of the digit 2\n    X = X[y == 2]\n\n    # Limit dataset to 500 samples\n    idx = np.random.choice(range(X.shape[0]), size=500, replace=False)\n    X = X[idx]\n\n    rbm = RBM(n_hidden=50, n_iterations=200, batch_size=25, learning_rate=0.001)\n    rbm.fit(X)\n\n    # Training error plot\n    training, = plt.plot(range(len(rbm.training_errors)), rbm.training_errors, label=\"Training Error\")\n    plt.legend(handles=[training])\n    plt.title(\"Error Plot\")\n    plt.ylabel('Error')\n    plt.xlabel('Iterations')\n    plt.show()\n\n    # Get the images that were reconstructed during training\n    gen_imgs = rbm.training_reconstructions\n\n    # Plot the reconstructed images during the first iteration\n    fig, axs = plt.subplots(5, 5)\n    plt.suptitle(\"Restricted Boltzmann Machine - First Iteration\")\n    cnt = 0\n    for i in range(5):\n        for j in range(5):\n            axs[i,j].imshow(gen_imgs[0][cnt].reshape((28, 28)), cmap='gray')\n            axs[i,j].axis('off')\n            cnt += 1\n    fig.savefig(\"rbm_first.png\")\n    plt.close()\n\n    # Plot the images during the last iteration\n    fig, axs = plt.subplots(5, 5)\n    plt.suptitle(\"Restricted Boltzmann Machine - Last Iteration\")\n    cnt = 0\n    for i in range(5):\n        for j in range(5):\n            axs[i,j].imshow(gen_imgs[-1][cnt].reshape((28, 28)), cmap='gray')\n            axs[i,j].axis('off')\n            cnt += 1\n    fig.savefig(\"rbm_last.png\")\n    plt.close()\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/ridge_regression.py",
    "content": "from __future__ import print_function\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport pandas as pd\n# Import helper functions\nfrom mlfromscratch.supervised_learning import PolynomialRidgeRegression\nfrom mlfromscratch.utils import k_fold_cross_validation_sets, normalize, Plot\nfrom mlfromscratch.utils import train_test_split, polynomial_features, mean_squared_error\n\n\ndef main():\n\n    # Load temperature data\n    data = pd.read_csv('mlfromscratch/data/TempLinkoping2016.txt', sep=\"\\t\")\n\n    time = np.atleast_2d(data[\"time\"].values).T\n    temp = data[\"temp\"].values\n\n    X = time # fraction of the year [0, 1]\n    y = temp\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)\n\n    poly_degree = 15\n\n    # Finding regularization constant using cross validation\n    lowest_error = float(\"inf\")\n    best_reg_factor = None\n    print (\"Finding regularization constant using cross validation:\")\n    k = 10\n    for reg_factor in np.arange(0, 0.1, 0.01):\n        cross_validation_sets = k_fold_cross_validation_sets(\n            X_train, y_train, k=k)\n        mse = 0\n        for _X_train, _X_test, _y_train, _y_test in cross_validation_sets:\n            model = PolynomialRidgeRegression(degree=poly_degree, \n                                            reg_factor=reg_factor,\n                                            learning_rate=0.001,\n                                            n_iterations=10000)\n            model.fit(_X_train, _y_train)\n            y_pred = model.predict(_X_test)\n            _mse = mean_squared_error(_y_test, y_pred)\n            mse += _mse\n        mse /= k\n\n        # Print the mean squared error\n        print (\"\\tMean Squared Error: %s (regularization: %s)\" % (mse, reg_factor))\n\n        # Save reg. 
constant that gave lowest error\n        if mse < lowest_error:\n            best_reg_factor = reg_factor\n            lowest_error = mse\n\n    # Make final prediction using the best regularization factor found above\n    model = PolynomialRidgeRegression(degree=poly_degree, \n                                    reg_factor=best_reg_factor,\n                                    learning_rate=0.001,\n                                    n_iterations=10000)\n    model.fit(X_train, y_train)\n\n    y_pred = model.predict(X_test)\n    mse = mean_squared_error(y_test, y_pred)\n    print (\"Mean squared error: %s (given by reg. factor: %s)\" % (mse, best_reg_factor))\n\n    y_pred_line = model.predict(X)\n\n    # Color map\n    cmap = plt.get_cmap('viridis')\n\n    # Plot the results\n    m1 = plt.scatter(366 * X_train, y_train, color=cmap(0.9), s=10)\n    m2 = plt.scatter(366 * X_test, y_test, color=cmap(0.5), s=10)\n    plt.plot(366 * X, y_pred_line, color='black', linewidth=2, label=\"Prediction\")\n    plt.suptitle(\"Polynomial Ridge Regression\")\n    plt.title(\"MSE: %.2f\" % mse, fontsize=10)\n    plt.xlabel('Day')\n    plt.ylabel('Temperature in Celsius')\n    plt.legend((m1, m2), (\"Training data\", \"Test data\"), loc='lower right')\n    plt.show()\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "mlfromscratch/examples/support_vector_machine.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\nfrom sklearn import datasets\n\n# Import helper functions\nfrom mlfromscratch.utils import train_test_split, normalize, accuracy_score, Plot\nfrom mlfromscratch.utils.kernels import *\nfrom mlfromscratch.supervised_learning import SupportVectorMachine\n\ndef main():\n    data = datasets.load_iris()\n    X = normalize(data.data[data.target != 0])\n    y = data.target[data.target != 0]\n    y[y == 1] = -1\n    y[y == 2] = 1\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)\n\n    clf = SupportVectorMachine(kernel=polynomial_kernel, power=4, coef=1)\n    clf.fit(X_train, y_train)\n    y_pred = clf.predict(X_test)\n\n    accuracy = accuracy_score(y_test, y_pred)\n\n    print (\"Accuracy:\", accuracy)\n\n    # Reduce dimension to two using PCA and plot the results\n    Plot().plot_in_2d(X_test, y_pred, title=\"Support Vector Machine\", accuracy=accuracy)\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/examples/xgboost.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\nfrom sklearn import datasets\nimport matplotlib.pyplot as plt\nimport progressbar\nfrom mlfromscratch.utils import train_test_split, standardize, to_categorical, normalize\nfrom mlfromscratch.utils import mean_squared_error, accuracy_score, Plot\nfrom mlfromscratch.supervised_learning import XGBoost\n\ndef main():\n\n    print (\"-- XGBoost --\")\n\n    data = datasets.load_iris()\n    X = data.data\n    y = data.target\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, seed=2)\n\n    clf = XGBoost()\n    clf.fit(X_train, y_train)\n    y_pred = clf.predict(X_test)\n\n    accuracy = accuracy_score(y_test, y_pred)\n\n    print (\"Accuracy:\", accuracy)\n\n    Plot().plot_in_2d(X_test, y_pred,\n        title=\"XGBoost\",\n        accuracy=accuracy,\n        legend_labels=data.target_names)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "mlfromscratch/reinforcement_learning/__init__.py",
    "content": "from .deep_q_network import DeepQNetwork"
  },
  {
    "path": "mlfromscratch/reinforcement_learning/deep_q_network.py",
    "content": "from __future__ import print_function, division\nimport random\nimport numpy as np\nimport gym\nfrom collections import deque\n\n\nclass DeepQNetwork():\n    \"\"\"Q-Learning with deep neural network to learn the control policy. \n    Uses a deep neural network model to predict the expected utility (Q-value) of executing an action in a given state. \n\n    Reference: https://arxiv.org/abs/1312.5602\n    Parameters:\n    -----------\n    env_name: string\n        The environment that the agent will explore. \n        Check: https://gym.openai.com/envs\n    epsilon: float\n        The epsilon-greedy value. The probability that the agent should select a random action instead of\n        the action that will maximize the expected utility. \n    gamma: float\n        Determines how much the agent should consider future rewards. \n    decay_rate: float\n        The rate of decay for the epsilon value after each epoch.\n    min_epsilon: float\n        The value which epsilon will approach as the training progresses.\n    \"\"\"\n    def __init__(self, env_name='CartPole-v1', epsilon=1, gamma=0.9, decay_rate=0.005, min_epsilon=0.1):\n        self.epsilon = epsilon\n        self.gamma = gamma\n        self.decay_rate = decay_rate\n        self.min_epsilon = min_epsilon\n        self.memory_size = 300\n        self.memory = []\n\n        # Initialize the environment\n        self.env = gym.make(env_name)\n        self.n_states = self.env.observation_space.shape[0]\n        self.n_actions = self.env.action_space.n\n    \n    def set_model(self, model):\n        self.model = model(n_inputs=self.n_states, n_outputs=self.n_actions)\n\n    def _select_action(self, state):\n        if np.random.rand() < self.epsilon:\n            # Choose action randomly\n            action = np.random.randint(self.n_actions)\n        else:\n            # Take action with highest predicted utility given state\n            action = np.argmax(self.model.predict(state), axis=1)[0]\n\n 
       return action\n\n    def _memorize(self, state, action, reward, new_state, done):\n        self.memory.append((state, action, reward, new_state, done))\n        # Make sure we restrict memory size to specified limit\n        if len(self.memory) > self.memory_size:\n            self.memory.pop(0)\n\n    def _construct_training_set(self, replay):\n        # Select states and new states from replay\n        states = np.array([a[0] for a in replay])\n        new_states = np.array([a[3] for a in replay])\n\n        # Predict the expected utility of current state and new state\n        Q = self.model.predict(states)\n        Q_new = self.model.predict(new_states)\n\n        replay_size = len(replay)\n        X = np.empty((replay_size, self.n_states))\n        y = np.empty((replay_size, self.n_actions))\n        \n        # Construct training set\n        for i in range(replay_size):\n            state_r, action_r, reward_r, new_state_r, done_r = replay[i]\n\n            target = Q[i]\n            target[action_r] = reward_r\n            # If we're done the utility is simply the reward of executing action a in\n            # state s, otherwise we add the expected maximum future reward as well\n            if not done_r:\n                target[action_r] += self.gamma * np.amax(Q_new[i])\n\n            X[i] = state_r\n            y[i] = target\n\n        return X, y\n\n    def train(self, n_epochs=500, batch_size=32):\n        max_reward = 0\n\n        for epoch in range(n_epochs):\n            state = self.env.reset()\n            total_reward = 0\n\n            epoch_loss = []\n            while True:\n\n                action = self._select_action(state)\n                # Take a step\n                new_state, reward, done, _ = self.env.step(action)\n\n                self._memorize(state, action, reward, new_state, done)\n\n                # Sample replay batch from memory\n                _batch_size = min(len(self.memory), batch_size)\n                replay 
= random.sample(self.memory, _batch_size)\n\n                # Construct training set from replay\n                X, y = self._construct_training_set(replay)\n\n                # Learn control policy\n                loss = self.model.train_on_batch(X, y)\n                epoch_loss.append(loss)\n\n                total_reward += reward\n                state = new_state\n\n                if done: break\n            \n            epoch_loss = np.mean(epoch_loss)\n\n            # Reduce the epsilon parameter\n            self.epsilon = self.min_epsilon + (1.0 - self.min_epsilon) * np.exp(-self.decay_rate * epoch)\n            \n            max_reward = max(max_reward, total_reward)\n\n            print (\"%d [Loss: %.4f, Reward: %s, Epsilon: %.4f, Max Reward: %s]\" % (epoch, epoch_loss, total_reward, self.epsilon, max_reward))\n\n        print (\"Training Finished\")\n\n    def play(self, n_epochs):\n        # self.env = gym.wrappers.Monitor(self.env, '/tmp/cartpole-experiment-1', force=True)\n        for epoch in range(n_epochs):\n            state = self.env.reset()\n            total_reward = 0\n            while True:\n                self.env.render()\n                action = np.argmax(self.model.predict(state), axis=1)[0]\n                state, reward, done, _ = self.env.step(action)\n                total_reward += reward\n                if done: break\n            print (\"%d Reward: %s\" % (epoch, total_reward))\n        self.env.close()\n"
  },
  {
    "path": "mlfromscratch/supervised_learning/__init__.py",
    "content": "from .adaboost import Adaboost\nfrom .bayesian_regression import BayesianRegression\nfrom .decision_tree import RegressionTree, ClassificationTree, XGBoostRegressionTree\nfrom .gradient_boosting import GradientBoostingClassifier, GradientBoostingRegressor\nfrom .k_nearest_neighbors import KNN\nfrom .linear_discriminant_analysis import LDA\nfrom .regression import LinearRegression, PolynomialRegression, LassoRegression\nfrom .regression import RidgeRegression, PolynomialRidgeRegression, ElasticNet\nfrom .logistic_regression import LogisticRegression\nfrom .multi_class_lda import MultiClassLDA\nfrom .naive_bayes import NaiveBayes\nfrom .perceptron import Perceptron\nfrom .random_forest import RandomForest\nfrom .support_vector_machine import SupportVectorMachine\nfrom .xgboost import XGBoost\nfrom .neuroevolution import Neuroevolution\nfrom .particle_swarm_optimization import ParticleSwarmOptimizedNN"
  },
  {
    "path": "mlfromscratch/supervised_learning/adaboost.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\nimport math\nfrom sklearn import datasets\nimport matplotlib.pyplot as plt\nimport pandas as pd\n\n# Import helper functions\nfrom mlfromscratch.utils import train_test_split, accuracy_score, Plot\n\n# Decision stump used as weak classifier in this impl. of Adaboost\nclass DecisionStump():\n    def __init__(self):\n        # Determines if sample shall be classified as -1 or 1 given threshold\n        self.polarity = 1\n        # The index of the feature used to make classification\n        self.feature_index = None\n        # The threshold value that the feature should be measured against\n        self.threshold = None\n        # Value indicative of the classifier's accuracy\n        self.alpha = None\n\nclass Adaboost():\n    \"\"\"Boosting method that uses a number of weak classifiers in \n    ensemble to make a strong classifier. This implementation uses decision\n    stumps, which is a one level Decision Tree. \n\n    Parameters:\n    -----------\n    n_clf: int\n        The number of weak classifiers that will be used. 
\n    \"\"\"\n    def __init__(self, n_clf=5):\n        self.n_clf = n_clf\n\n    def fit(self, X, y):\n        n_samples, n_features = np.shape(X)\n\n        # Initialize weights to 1/N\n        w = np.full(n_samples, (1 / n_samples))\n        \n        self.clfs = []\n        # Iterate through classifiers\n        for _ in range(self.n_clf):\n            clf = DecisionStump()\n            # Minimum error given for using a certain feature value threshold\n            # for predicting sample label\n            min_error = float('inf')\n            # Iterate through every unique feature value and see what value\n            # makes the best threshold for predicting y\n            for feature_i in range(n_features):\n                feature_values = np.expand_dims(X[:, feature_i], axis=1)\n                unique_values = np.unique(feature_values)\n                # Try every unique feature value as threshold\n                for threshold in unique_values:\n                    p = 1\n                    # Set all predictions to '1' initially\n                    prediction = np.ones(np.shape(y))\n                    # Label the samples whose values are below threshold as '-1'\n                    prediction[X[:, feature_i] < threshold] = -1\n                    # Error = sum of weights of misclassified samples\n                    error = sum(w[y != prediction])\n                    \n                    # If the error is over 50% we flip the polarity so that samples that\n                    # were classified as -1 are classified as 1, and vice versa\n                    # E.g. error = 0.8 => (1 - error) = 0.2\n                    if error > 0.5:\n                        error = 1 - error\n                        p = -1\n\n                    # If this threshold resulted in the smallest error we save the\n                    # configuration\n                    if error < min_error:\n                        clf.polarity = p\n                        clf.threshold = 
threshold\n                        clf.feature_index = feature_i\n                        min_error = error\n            # Calculate the alpha which is used to update the sample weights,\n            # Alpha is also an approximation of this classifier's proficiency\n            clf.alpha = 0.5 * math.log((1.0 - min_error) / (min_error + 1e-10))\n            # Set all predictions to '1' initially\n            predictions = np.ones(np.shape(y))\n            # The indexes where the sample values are below threshold\n            negative_idx = (clf.polarity * X[:, clf.feature_index] < clf.polarity * clf.threshold)\n            # Label those as '-1'\n            predictions[negative_idx] = -1\n            # Calculate new weights \n            # Misclassified samples get larger weights and correctly classified samples smaller ones\n            w *= np.exp(-clf.alpha * y * predictions)\n            # Normalize to one\n            w /= np.sum(w)\n\n            # Save classifier\n            self.clfs.append(clf)\n\n    def predict(self, X):\n        n_samples = np.shape(X)[0]\n        y_pred = np.zeros((n_samples, 1))\n        # For each classifier => label the samples\n        for clf in self.clfs:\n            # Set all predictions to '1' initially\n            predictions = np.ones(np.shape(y_pred))\n            # The indexes where the sample values are below threshold\n            negative_idx = (clf.polarity * X[:, clf.feature_index] < clf.polarity * clf.threshold)\n            # Label those as '-1'\n            predictions[negative_idx] = -1\n            # Add predictions weighted by the classifier's alpha\n            # (alpha indicative of classifier's proficiency)\n            y_pred += clf.alpha * predictions\n\n        # Return sign of prediction sum\n        y_pred = np.sign(y_pred).flatten()\n\n        return y_pred\n\n\ndef main():\n    data = datasets.load_digits()\n    X = data.data\n    y = data.target\n\n    digit1 = 1\n    digit2 = 8\n    idx = 
np.append(np.where(y == digit1)[0], np.where(y == digit2)[0])\n    y = data.target[idx]\n    # Change labels to {-1, 1}\n    y[y == digit1] = -1\n    y[y == digit2] = 1\n    X = data.data[idx]\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)\n\n    # Adaboost classification with 5 weak classifiers\n    clf = Adaboost(n_clf=5)\n    clf.fit(X_train, y_train)\n    y_pred = clf.predict(X_test)\n\n    accuracy = accuracy_score(y_test, y_pred)\n    print (\"Accuracy:\", accuracy)\n\n    # Reduce dimensions to 2d using pca and plot the results\n    Plot().plot_in_2d(X_test, y_pred, title=\"Adaboost\", accuracy=accuracy)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "mlfromscratch/supervised_learning/bayesian_regression.py",
    "content": "from __future__ import print_function, division\nimport numpy as np\nfrom scipy.stats import chi2, multivariate_normal\nfrom mlfromscratch.utils import mean_squared_error, train_test_split, polynomial_features\n\n\n\nclass BayesianRegression(object):\n    \"\"\"Bayesian regression model. If poly_degree is specified the features will\n    be transformed to with a polynomial basis function, which allows for polynomial\n    regression. Assumes Normal prior and likelihood for the weights and scaled inverse\n    chi-squared prior and likelihood for the variance of the weights.\n\n    Parameters:\n    -----------\n    n_draws: float\n        The number of simulated draws from the posterior of the parameters.\n    mu0: array\n        The mean values of the prior Normal distribution of the parameters.\n    omega0: array\n        The precision matrix of the prior Normal distribution of the parameters.\n    nu0: float\n        The degrees of freedom of the prior scaled inverse chi squared distribution.\n    sigma_sq0: float\n        The scale parameter of the prior scaled inverse chi squared distribution.\n    poly_degree: int\n        The polynomial degree that the features should be transformed to. Allows\n        for polynomial regression.\n    cred_int: float\n        The credible interval (ETI in this impl.). 95 => 95% credible interval of the posterior\n        of the parameters.\n\n    Reference:\n        https://github.com/mattiasvillani/BayesLearnCourse/raw/master/Slides/BayesLearnL5.pdf\n    \"\"\"\n    def __init__(self, n_draws, mu0, omega0, nu0, sigma_sq0, poly_degree=0, cred_int=95):\n        self.w = None\n        self.n_draws = n_draws\n        self.poly_degree = poly_degree\n        self.cred_int = cred_int\n\n        # Prior parameters\n        self.mu0 = mu0\n        self.omega0 = omega0\n        self.nu0 = nu0\n        self.sigma_sq0 = sigma_sq0\n\n    # Allows for simulation from the scaled inverse chi squared\n    # distribution. 
Assumes the variance is distributed according to\n    # this distribution.\n    # Reference:\n    #   https://en.wikipedia.org/wiki/Scaled_inverse_chi-squared_distribution\n    def _draw_scaled_inv_chi_sq(self, n, df, scale):\n        X = chi2.rvs(size=n, df=df)\n        sigma_sq = df * scale / X\n        return sigma_sq\n\n    def fit(self, X, y):\n\n        # If polynomial transformation\n        if self.poly_degree:\n            X = polynomial_features(X, degree=self.poly_degree)\n\n        n_samples, n_features = np.shape(X)\n\n        X_X = X.T.dot(X)\n\n        # Least squares approximate of beta\n        beta_hat = np.linalg.pinv(X_X).dot(X.T).dot(y)\n\n        # The posterior parameters can be determined analytically since we assume\n        # conjugate priors for the likelihoods.\n\n        # Normal prior / likelihood => Normal posterior\n        mu_n = np.linalg.pinv(X_X + self.omega0).dot(X_X.dot(beta_hat)+self.omega0.dot(self.mu0))\n        omega_n = X_X + self.omega0\n        # Scaled inverse chi-squared prior / likelihood => Scaled inverse chi-squared posterior\n        nu_n = self.nu0 + n_samples\n        sigma_sq_n = (1.0/nu_n)*(self.nu0*self.sigma_sq0 + \\\n            (y.T.dot(y) + self.mu0.T.dot(self.omega0).dot(self.mu0) - mu_n.T.dot(omega_n.dot(mu_n))))\n\n        # Simulate parameter values for n_draws\n        beta_draws = np.empty((self.n_draws, n_features))\n        for i in range(self.n_draws):\n            sigma_sq = self._draw_scaled_inv_chi_sq(n=1, df=nu_n, scale=sigma_sq_n)\n            beta = multivariate_normal.rvs(size=1, mean=mu_n[:,0], cov=sigma_sq*np.linalg.pinv(omega_n))\n            # Save parameter draws\n            beta_draws[i, :] = beta\n\n        # Select the mean of the simulated variables as the ones used to make predictions\n        self.w = np.mean(beta_draws, axis=0)\n\n        # Lower and upper boundary of the credible interval\n        l_eti = 50 - self.cred_int/2\n        u_eti = 50 + self.cred_int/2\n        
self.eti = np.array([[np.percentile(beta_draws[:,i], q=l_eti), np.percentile(beta_draws[:,i], q=u_eti)] \\\n                                for i in range(n_features)])\n\n    def predict(self, X, eti=False):\n\n        # If polynomial transformation\n        if self.poly_degree:\n            X = polynomial_features(X, degree=self.poly_degree)\n\n        y_pred = X.dot(self.w)\n        # If the lower and upper boundaries for the 95%\n        # equal tail interval should be returned\n        if eti:\n            lower_w = self.eti[:, 0]\n            upper_w = self.eti[:, 1]\n            y_lower_pred = X.dot(lower_w)\n            y_upper_pred = X.dot(upper_w)\n            return y_pred, y_lower_pred, y_upper_pred\n\n        return y_pred\n"
  },
  {
    "path": "mlfromscratch/supervised_learning/decision_tree.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\n\nfrom mlfromscratch.utils import divide_on_feature, train_test_split, standardize, mean_squared_error\nfrom mlfromscratch.utils import calculate_entropy, accuracy_score, calculate_variance\n\nclass DecisionNode():\n    \"\"\"Class that represents a decision node or leaf in the decision tree\n\n    Parameters:\n    -----------\n    feature_i: int\n        Feature index which we want to use as the threshold measure.\n    threshold: float\n        The value that we will compare feature values at feature_i against to\n        determine the prediction.\n    value: float\n        The class prediction if classification tree, or float value if regression tree.\n    true_branch: DecisionNode\n        Next decision node for samples where features value met the threshold.\n    false_branch: DecisionNode\n        Next decision node for samples where features value did not meet the threshold.\n    \"\"\"\n    def __init__(self, feature_i=None, threshold=None,\n                 value=None, true_branch=None, false_branch=None):\n        self.feature_i = feature_i          # Index for the feature that is tested\n        self.threshold = threshold          # Threshold value for feature\n        self.value = value                  # Value if the node is a leaf in the tree\n        self.true_branch = true_branch      # 'Left' subtree\n        self.false_branch = false_branch    # 'Right' subtree\n\n\n# Super class of RegressionTree and ClassificationTree\nclass DecisionTree(object):\n    \"\"\"Super class of RegressionTree and ClassificationTree.\n\n    Parameters:\n    -----------\n    min_samples_split: int\n        The minimum number of samples needed to make a split when building a tree.\n    min_impurity: float\n        The minimum impurity required to split the tree further.\n    max_depth: int\n        The maximum depth of a tree.\n    loss: function\n        Loss function that is used for 
Gradient Boosting models to calculate impurity.\n    \"\"\"\n    def __init__(self, min_samples_split=2, min_impurity=1e-7,\n                 max_depth=float(\"inf\"), loss=None):\n        self.root = None  # Root node in dec. tree\n        # Minimum n of samples to justify split\n        self.min_samples_split = min_samples_split\n        # The minimum impurity to justify split\n        self.min_impurity = min_impurity\n        # The maximum depth to grow the tree to\n        self.max_depth = max_depth\n        # Function to calculate impurity (classif.=>info gain, regr=>variance reduct.)\n        self._impurity_calculation = None\n        # Function to determine prediction of y at leaf\n        self._leaf_value_calculation = None\n        # If y is one-hot encoded (multi-dim) or not (one-dim)\n        self.one_dim = None\n        # If Gradient Boost\n        self.loss = loss\n\n    def fit(self, X, y, loss=None):\n        \"\"\" Build decision tree \"\"\"\n        self.one_dim = len(np.shape(y)) == 1\n        self.root = self._build_tree(X, y)\n        self.loss=None\n\n    def _build_tree(self, X, y, current_depth=0):\n        \"\"\" Recursive method which builds out the decision tree and splits X and respective y\n        on the feature of X which (based on impurity) best separates the data\"\"\"\n\n        largest_impurity = 0\n        best_criteria = None    # Feature index and threshold\n        best_sets = None        # Subsets of the data\n\n        # Check if expansion of y is needed\n        if len(np.shape(y)) == 1:\n            y = np.expand_dims(y, axis=1)\n\n        # Add y as last column of X\n        Xy = np.concatenate((X, y), axis=1)\n\n        n_samples, n_features = np.shape(X)\n\n        if n_samples >= self.min_samples_split and current_depth <= self.max_depth:\n            # Calculate the impurity for each feature\n            for feature_i in range(n_features):\n                # All values of feature_i\n                feature_values = 
np.expand_dims(X[:, feature_i], axis=1)\n                unique_values = np.unique(feature_values)\n\n                # Iterate through all unique values of feature column i and\n                # calculate the impurity\n                for threshold in unique_values:\n                    # Divide X and y depending on if the feature value of X at index feature_i\n                    # meets the threshold\n                    Xy1, Xy2 = divide_on_feature(Xy, feature_i, threshold)\n\n                    if len(Xy1) > 0 and len(Xy2) > 0:\n                        # Select the y-values of the two sets\n                        y1 = Xy1[:, n_features:]\n                        y2 = Xy2[:, n_features:]\n\n                        # Calculate impurity\n                        impurity = self._impurity_calculation(y, y1, y2)\n\n                        # If this threshold resulted in a higher information gain than previously\n                        # recorded save the threshold value and the feature\n                        # index\n                        if impurity > largest_impurity:\n                            largest_impurity = impurity\n                            best_criteria = {\"feature_i\": feature_i, \"threshold\": threshold}\n                            best_sets = {\n                                \"leftX\": Xy1[:, :n_features],   # X of left subtree\n                                \"lefty\": Xy1[:, n_features:],   # y of left subtree\n                                \"rightX\": Xy2[:, :n_features],  # X of right subtree\n                                \"righty\": Xy2[:, n_features:]   # y of right subtree\n                                }\n\n        if largest_impurity > self.min_impurity:\n            # Build subtrees for the right and left branches\n            true_branch = self._build_tree(best_sets[\"leftX\"], best_sets[\"lefty\"], current_depth + 1)\n            false_branch = self._build_tree(best_sets[\"rightX\"], best_sets[\"righty\"], 
current_depth + 1)\n            return DecisionNode(feature_i=best_criteria[\"feature_i\"], threshold=best_criteria[\n                                \"threshold\"], true_branch=true_branch, false_branch=false_branch)\n\n        # We're at leaf => determine value\n        leaf_value = self._leaf_value_calculation(y)\n\n        return DecisionNode(value=leaf_value)\n\n\n    def predict_value(self, x, tree=None):\n        \"\"\" Do a recursive search down the tree and make a prediction of the data sample by the\n            value of the leaf that we end up at \"\"\"\n\n        if tree is None:\n            tree = self.root\n\n        # If we have a value (i.e we're at a leaf) => return value as the prediction\n        if tree.value is not None:\n            return tree.value\n\n        # Choose the feature that we will test\n        feature_value = x[tree.feature_i]\n\n        # Determine if we will follow left or right branch\n        branch = tree.false_branch\n        if isinstance(feature_value, int) or isinstance(feature_value, float):\n            if feature_value >= tree.threshold:\n                branch = tree.true_branch\n        elif feature_value == tree.threshold:\n            branch = tree.true_branch\n\n        # Test subtree\n        return self.predict_value(x, branch)\n\n    def predict(self, X):\n        \"\"\" Classify samples one by one and return the set of labels \"\"\"\n        y_pred = [self.predict_value(sample) for sample in X]\n        return y_pred\n\n    def print_tree(self, tree=None, indent=\" \"):\n        \"\"\" Recursively print the decision tree \"\"\"\n        if not tree:\n            tree = self.root\n\n        # If we're at leaf => print the label\n        if tree.value is not None:\n            print (tree.value)\n        # Go deeper down the tree\n        else:\n            # Print test\n            print (\"%s:%s? 
\" % (tree.feature_i, tree.threshold))\n            # Print the true scenario\n            print (\"%sT->\" % (indent), end=\"\")\n            self.print_tree(tree.true_branch, indent + indent)\n            # Print the false scenario\n            print (\"%sF->\" % (indent), end=\"\")\n            self.print_tree(tree.false_branch, indent + indent)\n\n\n\nclass XGBoostRegressionTree(DecisionTree):\n    \"\"\"\n    Regression tree for XGBoost\n    - Reference -\n    http://xgboost.readthedocs.io/en/latest/model.html\n    \"\"\"\n\n    def _split(self, y):\n        \"\"\" y contains y_true in left half of the middle column and\n        y_pred in the right half. Split and return the two matrices \"\"\"\n        col = int(np.shape(y)[1]/2)\n        y, y_pred = y[:, :col], y[:, col:]\n        return y, y_pred\n\n    def _gain(self, y, y_pred):\n        nominator = np.power((y * self.loss.gradient(y, y_pred)).sum(), 2)\n        denominator = self.loss.hess(y, y_pred).sum()\n        return 0.5 * (nominator / denominator)\n\n    def _gain_by_taylor(self, y, y1, y2):\n        # Split\n        y, y_pred = self._split(y)\n        y1, y1_pred = self._split(y1)\n        y2, y2_pred = self._split(y2)\n\n        true_gain = self._gain(y1, y1_pred)\n        false_gain = self._gain(y2, y2_pred)\n        gain = self._gain(y, y_pred)\n        return true_gain + false_gain - gain\n\n    def _approximate_update(self, y):\n        # y split into y, y_pred\n        y, y_pred = self._split(y)\n        # Newton's Method\n        gradient = np.sum(y * self.loss.gradient(y, y_pred), axis=0)\n        hessian = np.sum(self.loss.hess(y, y_pred), axis=0)\n        update_approximation =  gradient / hessian\n\n        return update_approximation\n\n    def fit(self, X, y):\n        self._impurity_calculation = self._gain_by_taylor\n        self._leaf_value_calculation = self._approximate_update\n        super(XGBoostRegressionTree, self).fit(X, y)\n\n\nclass RegressionTree(DecisionTree):\n    def 
_calculate_variance_reduction(self, y, y1, y2):\n        var_tot = calculate_variance(y)\n        var_1 = calculate_variance(y1)\n        var_2 = calculate_variance(y2)\n        frac_1 = len(y1) / len(y)\n        frac_2 = len(y2) / len(y)\n\n        # Calculate the variance reduction\n        variance_reduction = var_tot - (frac_1 * var_1 + frac_2 * var_2)\n\n        return sum(variance_reduction)\n\n    def _mean_of_y(self, y):\n        value = np.mean(y, axis=0)\n        return value if len(value) > 1 else value[0]\n\n    def fit(self, X, y):\n        self._impurity_calculation = self._calculate_variance_reduction\n        self._leaf_value_calculation = self._mean_of_y\n        super(RegressionTree, self).fit(X, y)\n\nclass ClassificationTree(DecisionTree):\n    def _calculate_information_gain(self, y, y1, y2):\n        # Calculate information gain\n        p = len(y1) / len(y)\n        entropy = calculate_entropy(y)\n        info_gain = entropy - p * \\\n            calculate_entropy(y1) - (1 - p) * \\\n            calculate_entropy(y2)\n\n        return info_gain\n\n    def _majority_vote(self, y):\n        most_common = None\n        max_count = 0\n        for label in np.unique(y):\n            # Count number of occurrences of samples with label\n            count = len(y[y == label])\n            if count > max_count:\n                most_common = label\n                max_count = count\n        return most_common\n\n    def fit(self, X, y):\n        self._impurity_calculation = self._calculate_information_gain\n        self._leaf_value_calculation = self._majority_vote\n        super(ClassificationTree, self).fit(X, y)\n"
  },
  {
    "path": "mlfromscratch/supervised_learning/gradient_boosting.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\nimport progressbar\n\n# Import helper functions\nfrom mlfromscratch.utils import train_test_split, standardize, to_categorical\nfrom mlfromscratch.utils import mean_squared_error, accuracy_score\nfrom mlfromscratch.deep_learning.loss_functions import SquareLoss, CrossEntropy\nfrom mlfromscratch.supervised_learning.decision_tree import RegressionTree\nfrom mlfromscratch.utils.misc import bar_widgets\n\n\nclass GradientBoosting(object):\n    \"\"\"Super class of GradientBoostingClassifier and GradientBoostinRegressor. \n    Uses a collection of regression trees that trains on predicting the gradient\n    of the loss function. \n\n    Parameters:\n    -----------\n    n_estimators: int\n        The number of classification trees that are used.\n    learning_rate: float\n        The step length that will be taken when following the negative gradient during\n        training.\n    min_samples_split: int\n        The minimum number of samples needed to make a split when building a tree.\n    min_impurity: float\n        The minimum impurity required to split the tree further. 
\n    max_depth: int\n        The maximum depth of a tree.\n    regression: boolean\n        True or false depending on if we're doing regression or classification.\n    \"\"\"\n    def __init__(self, n_estimators, learning_rate, min_samples_split,\n                 min_impurity, max_depth, regression):\n        self.n_estimators = n_estimators\n        self.learning_rate = learning_rate\n        self.min_samples_split = min_samples_split\n        self.min_impurity = min_impurity\n        self.max_depth = max_depth\n        self.regression = regression\n        self.bar = progressbar.ProgressBar(widgets=bar_widgets)\n        \n        # Square loss for regression\n        # Log loss for classification\n        self.loss = SquareLoss()\n        if not self.regression:\n            self.loss = CrossEntropy()\n\n        # Initialize regression trees\n        self.trees = []\n        for _ in range(n_estimators):\n            tree = RegressionTree(\n                    min_samples_split=self.min_samples_split,\n                    min_impurity=min_impurity,\n                    max_depth=self.max_depth)\n            self.trees.append(tree)\n\n\n    def fit(self, X, y):\n        y_pred = np.full(np.shape(y), np.mean(y, axis=0))\n        for i in self.bar(range(self.n_estimators)):\n            gradient = self.loss.gradient(y, y_pred)\n            self.trees[i].fit(X, gradient)\n            update = self.trees[i].predict(X)\n            # Update y prediction\n            y_pred -= np.multiply(self.learning_rate, update)\n\n\n    def predict(self, X):\n        y_pred = np.array([])\n        # Make predictions\n        for tree in self.trees:\n            update = tree.predict(X)\n            update = np.multiply(self.learning_rate, update)\n            y_pred = -update if not y_pred.any() else y_pred - update\n\n        if not self.regression:\n            # Turn into probability distribution\n            y_pred = np.exp(y_pred) / np.expand_dims(np.sum(np.exp(y_pred), 
axis=1), axis=1)\n            # Set label to the value that maximizes probability\n            y_pred = np.argmax(y_pred, axis=1)\n        return y_pred\n\n\nclass GradientBoostingRegressor(GradientBoosting):\n    def __init__(self, n_estimators=200, learning_rate=0.5, min_samples_split=2,\n                 min_var_red=1e-7, max_depth=4, debug=False):\n        super(GradientBoostingRegressor, self).__init__(n_estimators=n_estimators, \n            learning_rate=learning_rate, \n            min_samples_split=min_samples_split, \n            min_impurity=min_var_red,\n            max_depth=max_depth,\n            regression=True)\n\nclass GradientBoostingClassifier(GradientBoosting):\n    def __init__(self, n_estimators=200, learning_rate=.5, min_samples_split=2,\n                 min_info_gain=1e-7, max_depth=2, debug=False):\n        super(GradientBoostingClassifier, self).__init__(n_estimators=n_estimators, \n            learning_rate=learning_rate, \n            min_samples_split=min_samples_split, \n            min_impurity=min_info_gain,\n            max_depth=max_depth,\n            regression=False)\n\n    def fit(self, X, y):\n        y = to_categorical(y)\n        super(GradientBoostingClassifier, self).fit(X, y)\n\n"
  },
  {
    "path": "mlfromscratch/supervised_learning/k_nearest_neighbors.py",
    "content": "from __future__ import print_function, division\nimport numpy as np\nfrom mlfromscratch.utils import euclidean_distance\n\nclass KNN():\n    \"\"\" K Nearest Neighbors classifier.\n\n    Parameters:\n    -----------\n    k: int\n        The number of closest neighbors that will determine the class of the \n        sample that we wish to predict.\n    \"\"\"\n    def __init__(self, k=5):\n        self.k = k\n\n    def _vote(self, neighbor_labels):\n        \"\"\" Return the most common class among the neighbor samples \"\"\"\n        counts = np.bincount(neighbor_labels.astype('int'))\n        return counts.argmax()\n\n    def predict(self, X_test, X_train, y_train):\n        y_pred = np.empty(X_test.shape[0])\n        # Determine the class of each sample\n        for i, test_sample in enumerate(X_test):\n            # Sort the training samples by their distance to the test sample and get the K nearest\n            idx = np.argsort([euclidean_distance(test_sample, x) for x in X_train])[:self.k]\n            # Extract the labels of the K nearest neighboring training samples\n            k_nearest_neighbors = np.array([y_train[i] for i in idx])\n            # Label sample as the most common class label\n            y_pred[i] = self._vote(k_nearest_neighbors)\n\n        return y_pred\n        "
  },
  {
    "path": "mlfromscratch/supervised_learning/linear_discriminant_analysis.py",
    "content": "from __future__ import print_function, division\nimport numpy as np\nfrom mlfromscratch.utils import calculate_covariance_matrix, normalize, standardize\n\nclass LDA():\n    \"\"\"The Linear Discriminant Analysis classifier, also known as Fisher's linear discriminant.\n    Can besides from classification also be used to reduce the dimensionaly of the dataset.\n    \"\"\"\n    def __init__(self):\n        self.w = None\n\n    def transform(self, X, y):\n        self.fit(X, y)\n        # Project data onto vector\n        X_transform = X.dot(self.w)\n        return X_transform\n\n    def fit(self, X, y):\n        # Separate data by class\n        X1 = X[y == 0]\n        X2 = X[y == 1]\n\n        # Calculate the covariance matrices of the two datasets\n        cov1 = calculate_covariance_matrix(X1)\n        cov2 = calculate_covariance_matrix(X2)\n        cov_tot = cov1 + cov2\n\n        # Calculate the mean of the two datasets\n        mean1 = X1.mean(0)\n        mean2 = X2.mean(0)\n        mean_diff = np.atleast_1d(mean1 - mean2)\n\n        # Determine the vector which when X is projected onto it best separates the\n        # data by class. w = (mean1 - mean2) / (cov1 + cov2)\n        self.w = np.linalg.pinv(cov_tot).dot(mean_diff)\n\n    def predict(self, X):\n        y_pred = []\n        for sample in X:\n            h = sample.dot(self.w)\n            y = 1 * (h < 0)\n            y_pred.append(y)\n        return y_pred\n"
  },
  {
    "path": "mlfromscratch/supervised_learning/logistic_regression.py",
    "content": "from __future__ import print_function, division\nimport numpy as np\nimport math\nfrom mlfromscratch.utils import make_diagonal, Plot\nfrom mlfromscratch.deep_learning.activation_functions import Sigmoid\n\n\nclass LogisticRegression():\n    \"\"\" Logistic Regression classifier.\n    Parameters:\n    -----------\n    learning_rate: float\n        The step length that will be taken when following the negative gradient during\n        training.\n    gradient_descent: boolean\n        True or false depending if gradient descent should be used when training. If\n        false then we use batch optimization by least squares.\n    \"\"\"\n    def __init__(self, learning_rate=.1, gradient_descent=True):\n        self.param = None\n        self.learning_rate = learning_rate\n        self.gradient_descent = gradient_descent\n        self.sigmoid = Sigmoid()\n\n    def _initialize_parameters(self, X):\n        n_features = np.shape(X)[1]\n        # Initialize parameters between [-1/sqrt(N), 1/sqrt(N)]\n        limit = 1 / math.sqrt(n_features)\n        self.param = np.random.uniform(-limit, limit, (n_features,))\n\n    def fit(self, X, y, n_iterations=4000):\n        self._initialize_parameters(X)\n        # Tune parameters for n iterations\n        for i in range(n_iterations):\n            # Make a new prediction\n            y_pred = self.sigmoid(X.dot(self.param))\n            if self.gradient_descent:\n                # Move against the gradient of the loss function with\n                # respect to the parameters to minimize the loss\n                self.param -= self.learning_rate * -(y - y_pred).dot(X)\n            else:\n                # Make a diagonal matrix of the sigmoid gradient column vector\n                diag_gradient = make_diagonal(self.sigmoid.gradient(X.dot(self.param)))\n                # Batch opt:\n                self.param = np.linalg.pinv(X.T.dot(diag_gradient).dot(X)).dot(X.T).dot(diag_gradient.dot(X).dot(self.param) + y - 
y_pred)\n\n    def predict(self, X):\n        y_pred = np.round(self.sigmoid(X.dot(self.param))).astype(int)\n        return y_pred\n"
  },
  {
    "path": "mlfromscratch/supervised_learning/multi_class_lda.py",
    "content": "from __future__ import print_function, division\nimport matplotlib.pyplot as plt\nimport numpy as np\nfrom mlfromscratch.utils import calculate_covariance_matrix, normalize, standardize\n\n\nclass MultiClassLDA():\n    \"\"\"Enables dimensionality reduction for multiple\n    class distributions. It transforms the features space into a space where\n    the between class scatter is maximized and the within class scatter is\n    minimized.\n\n    Parameters:\n    -----------\n    solver: str\n        If 'svd' we use the pseudo-inverse to calculate the inverse of matrices\n        when doing the transformation.\n    \"\"\"\n    def __init__(self, solver=\"svd\"):\n        self.solver = solver\n\n    def _calculate_scatter_matrices(self, X, y):\n        n_features = np.shape(X)[1]\n        labels = np.unique(y)\n\n        # Within class scatter matrix:\n        # SW = sum{ (X_for_class - mean_of_X_for_class)^2 }\n        #   <=> (n_samples_X_for_class - 1) * covar(X_for_class)\n        SW = np.empty((n_features, n_features))\n        for label in labels:\n            _X = X[y == label]\n            SW += (len(_X) - 1) * calculate_covariance_matrix(_X)\n\n        # Between class scatter:\n        # SB = sum{ n_samples_for_class * (mean_for_class - total_mean)^2 }\n        total_mean = np.mean(X, axis=0)\n        SB = np.empty((n_features, n_features))\n        for label in labels:\n            _X = X[y == label]\n            _mean = np.mean(_X, axis=0)\n            SB += len(_X) * (_mean - total_mean).dot((_mean - total_mean).T)\n\n        return SW, SB\n\n    def transform(self, X, y, n_components):\n        SW, SB = self._calculate_scatter_matrices(X, y)\n\n        # Determine SW^-1 * SB by calculating inverse of SW\n        A = np.linalg.inv(SW).dot(SB)\n\n        # Get eigenvalues and eigenvectors of SW^-1 * SB\n        eigenvalues, eigenvectors = np.linalg.eigh(A)\n\n        # Sort the eigenvalues and corresponding eigenvectors from largest\n        
# to smallest eigenvalue and select the first n_components\n        idx = eigenvalues.argsort()[::-1]\n        eigenvalues = eigenvalues[idx][:n_components]\n        eigenvectors = eigenvectors[:, idx][:, :n_components]\n\n        # Project the data onto eigenvectors\n        X_transformed = X.dot(eigenvectors)\n\n        return X_transformed\n\n\n    def plot_in_2d(self, X, y, title=None):\n        \"\"\" Plot the dataset X and the corresponding labels y in 2D using the LDA\n        transformation.\"\"\"\n        X_transformed = self.transform(X, y, n_components=2)\n        x1 = X_transformed[:, 0]\n        x2 = X_transformed[:, 1]\n        plt.scatter(x1, x2, c=y)\n        if title: plt.title(title)\n        plt.show()\n"
  },
  {
    "path": "mlfromscratch/supervised_learning/multilayer_perceptron.py",
    "content": "from __future__ import print_function, division\nimport numpy as np\nimport math\nfrom sklearn import datasets\n\nfrom mlfromscratch.utils import train_test_split, to_categorical, normalize, accuracy_score, Plot\nfrom mlfromscratch.deep_learning.activation_functions import Sigmoid, Softmax\nfrom mlfromscratch.deep_learning.loss_functions import CrossEntropy\n\nclass MultilayerPerceptron():\n    \"\"\"Multilayer Perceptron classifier. A fully-connected neural network with one hidden layer.\n    Unrolled to display the whole forward and backward pass.\n\n    Parameters:\n    -----------\n    n_hidden: int:\n        The number of processing nodes (neurons) in the hidden layer. \n    n_iterations: float\n        The number of training iterations the algorithm will tune the weights for.\n    learning_rate: float\n        The step length that will be used when updating the weights.\n    \"\"\"\n    def __init__(self, n_hidden, n_iterations=3000, learning_rate=0.01):\n        self.n_hidden = n_hidden\n        self.n_iterations = n_iterations\n        self.learning_rate = learning_rate\n        self.hidden_activation = Sigmoid()\n        self.output_activation = Softmax()\n        self.loss = CrossEntropy()\n\n    def _initialize_weights(self, X, y):\n        n_samples, n_features = X.shape\n        _, n_outputs = y.shape\n        # Hidden layer\n        limit   = 1 / math.sqrt(n_features)\n        self.W  = np.random.uniform(-limit, limit, (n_features, self.n_hidden))\n        self.w0 = np.zeros((1, self.n_hidden))\n        # Output layer\n        limit   = 1 / math.sqrt(self.n_hidden)\n        self.V  = np.random.uniform(-limit, limit, (self.n_hidden, n_outputs))\n        self.v0 = np.zeros((1, n_outputs))\n\n    def fit(self, X, y):\n\n        self._initialize_weights(X, y)\n\n        for i in range(self.n_iterations):\n\n            # ..............\n            #  Forward Pass\n            # ..............\n\n            # HIDDEN LAYER\n            
hidden_input = X.dot(self.W) + self.w0\n            hidden_output = self.hidden_activation(hidden_input)\n            # OUTPUT LAYER\n            output_layer_input = hidden_output.dot(self.V) + self.v0\n            y_pred = self.output_activation(output_layer_input)\n\n            # ...............\n            #  Backward Pass\n            # ...............\n\n            # OUTPUT LAYER\n            # Grad. w.r.t input of output layer\n            grad_wrt_out_l_input = self.loss.gradient(y, y_pred) * self.output_activation.gradient(output_layer_input)\n            grad_v = hidden_output.T.dot(grad_wrt_out_l_input)\n            grad_v0 = np.sum(grad_wrt_out_l_input, axis=0, keepdims=True)\n            # HIDDEN LAYER\n            # Grad. w.r.t input of hidden layer\n            grad_wrt_hidden_l_input = grad_wrt_out_l_input.dot(self.V.T) * self.hidden_activation.gradient(hidden_input)\n            grad_w = X.T.dot(grad_wrt_hidden_l_input)\n            grad_w0 = np.sum(grad_wrt_hidden_l_input, axis=0, keepdims=True)\n\n            # Update weights (by gradient descent)\n            # Move against the gradient to minimize loss\n            self.V  -= self.learning_rate * grad_v\n            self.v0 -= self.learning_rate * grad_v0\n            self.W  -= self.learning_rate * grad_w\n            self.w0 -= self.learning_rate * grad_w0\n\n    # Use the trained model to predict labels of X\n    def predict(self, X):\n        # Forward pass:\n        hidden_input = X.dot(self.W) + self.w0\n        hidden_output = self.hidden_activation(hidden_input)\n        output_layer_input = hidden_output.dot(self.V) + self.v0\n        y_pred = self.output_activation(output_layer_input)\n        return y_pred\n\n\ndef main():\n    data = datasets.load_digits()\n    X = normalize(data.data)\n    y = data.target\n\n    # Convert the nominal y values to binary\n    y = to_categorical(y)\n\n    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, seed=1)\n\n    # 
MLP\n    clf = MultilayerPerceptron(n_hidden=16,\n        n_iterations=1000,\n        learning_rate=0.01)\n\n    clf.fit(X_train, y_train)\n    y_pred = np.argmax(clf.predict(X_test), axis=1)\n    y_test = np.argmax(y_test, axis=1)\n\n    accuracy = accuracy_score(y_test, y_pred)\n    print (\"Accuracy:\", accuracy)\n\n    # Reduce dimension to two using PCA and plot the results\n    Plot().plot_in_2d(X_test, y_pred, title=\"Multilayer Perceptron\", accuracy=accuracy, legend_labels=np.unique(y))\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "mlfromscratch/supervised_learning/naive_bayes.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\nimport math\nfrom mlfromscratch.utils import train_test_split, normalize\nfrom mlfromscratch.utils import Plot, accuracy_score\n\nclass NaiveBayes():\n    \"\"\"The Gaussian Naive Bayes classifier. \"\"\"\n    def fit(self, X, y):\n        self.X, self.y = X, y\n        self.classes = np.unique(y)\n        self.parameters = []\n        # Calculate the mean and variance of each feature for each class\n        for i, c in enumerate(self.classes):\n            # Only select the rows where the label equals the given class\n            X_where_c = X[np.where(y == c)]\n            self.parameters.append([])\n            # Add the mean and variance for each feature (column)\n            for col in X_where_c.T:\n                parameters = {\"mean\": col.mean(), \"var\": col.var()}\n                self.parameters[i].append(parameters)\n\n    def _calculate_likelihood(self, mean, var, x):\n        \"\"\" Gaussian likelihood of the data x given mean and var \"\"\"\n        eps = 1e-4 # Added in denominator to prevent division by zero\n        coeff = 1.0 / math.sqrt(2.0 * math.pi * var + eps)\n        exponent = math.exp(-(math.pow(x - mean, 2) / (2 * var + eps)))\n        return coeff * exponent\n\n    def _calculate_prior(self, c):\n        \"\"\" Calculate the prior of class c\n        (samples where class == c / total number of samples)\"\"\"\n        frequency = np.mean(self.y == c)\n        return frequency\n\n    def _classify(self, sample):\n        \"\"\" Classification using Bayes Rule P(Y|X) = P(X|Y)*P(Y)/P(X),\n            or Posterior = Likelihood * Prior / Scaling Factor\n\n        P(Y|X) - The posterior is the probability that sample x is of class y given the\n                 feature values of x being distributed according to distribution of y and the prior.\n        P(X|Y) - Likelihood of data X given class distribution Y.\n                 Gaussian distribution (given by 
_calculate_likelihood)\n        P(Y)   - Prior (given by _calculate_prior)\n        P(X)   - Scales the posterior to make it a proper probability distribution.\n                 This term is ignored in this implementation since it doesn't affect\n                 which class distribution the sample is most likely to belong to.\n\n        Classifies the sample as the class that results in the largest P(Y|X) (posterior)\n        \"\"\"\n        posteriors = []\n        # Go through list of classes\n        for i, c in enumerate(self.classes):\n            # Initialize posterior as prior\n            posterior = self._calculate_prior(c)\n            # Naive assumption (independence):\n            # P(x1,x2,x3|Y) = P(x1|Y)*P(x2|Y)*P(x3|Y)\n            # Posterior is product of prior and likelihoods (ignoring scaling factor)\n            for feature_value, params in zip(sample, self.parameters[i]):\n                # Likelihood of feature value given distribution of feature values given y\n                likelihood = self._calculate_likelihood(params[\"mean\"], params[\"var\"], feature_value)\n                posterior *= likelihood\n            posteriors.append(posterior)\n        # Return the class with the largest posterior probability\n        return self.classes[np.argmax(posteriors)]\n\n    def predict(self, X):\n        \"\"\" Predict the class labels of the samples in X \"\"\"\n        y_pred = [self._classify(sample) for sample in X]\n        return y_pred\n"
  },
  {
    "path": "mlfromscratch/supervised_learning/neuroevolution.py",
    "content": "from __future__ import print_function, division\nimport numpy as np\nimport copy\n\nclass Neuroevolution():\n    \"\"\" Evolutionary optimization of Neural Networks.\n\n    Parameters:\n    -----------\n    n_individuals: int\n        The number of neural networks that are allowed in the population at a time.\n    mutation_rate: float\n        The probability that a weight will be mutated.\n    model_builder: method\n        A method which returns a user specified NeuralNetwork instance. \n    \"\"\"\n    def __init__(self, population_size, mutation_rate, model_builder):\n        self.population_size = population_size\n        self.mutation_rate = mutation_rate\n        self.model_builder = model_builder\n\n    def _build_model(self, id):\n        \"\"\" Returns a new individual \"\"\"\n        model = self.model_builder(n_inputs=self.X.shape[1], n_outputs=self.y.shape[1])\n        model.id = id\n        model.fitness = 0\n        model.accuracy = 0\n        \n        return model\n\n    def _initialize_population(self):\n        \"\"\" Initialization of the neural networks forming the population\"\"\"\n        self.population = []\n        for _ in range(self.population_size):\n            model = self._build_model(id=np.random.randint(1000))\n            self.population.append(model)\n\n    def _mutate(self, individual, var=1):\n        \"\"\" Add zero mean gaussian noise to the layer weights with probability mutation_rate \"\"\"\n        for layer in individual.layers:\n            if hasattr(layer, 'W'):\n                # Mutation of weight with probability self.mutation_rate\n                mutation_mask = np.random.binomial(1, p=self.mutation_rate, size=layer.W.shape)\n                layer.W += np.random.normal(loc=0, scale=var, size=layer.W.shape) * mutation_mask\n                mutation_mask = np.random.binomial(1, p=self.mutation_rate, size=layer.w0.shape)\n                layer.w0 += np.random.normal(loc=0, scale=var, 
size=layer.w0.shape) * mutation_mask\n        \n        return individual\n\n    def _inherit_weights(self, child, parent):\n        \"\"\" Copies the weights from parent to child \"\"\"\n        for i in range(len(child.layers)):\n            if hasattr(child.layers[i], 'W'):\n                # The child inherits both weights W and bias weights w0\n                child.layers[i].W = parent.layers[i].W.copy()\n                child.layers[i].w0 = parent.layers[i].w0.copy()\n\n    def _crossover(self, parent1, parent2):\n        \"\"\" Performs crossover between the neurons in parent1 and parent2 to form offspring \"\"\"\n        child1 = self._build_model(id=parent1.id+1)\n        self._inherit_weights(child1, parent1)\n        child2 = self._build_model(id=parent2.id+1)\n        self._inherit_weights(child2, parent2)\n\n        # Perform crossover\n        for i in range(len(child1.layers)):\n            if hasattr(child1.layers[i], 'W'):\n                n_neurons = child1.layers[i].W.shape[1]\n                # Perform crossover between the individuals' neuron weights\n                cutoff = np.random.randint(0, n_neurons)\n                child1.layers[i].W[:, cutoff:] = parent2.layers[i].W[:, cutoff:].copy()\n                child1.layers[i].w0[:, cutoff:] = parent2.layers[i].w0[:, cutoff:].copy()\n                child2.layers[i].W[:, cutoff:] = parent1.layers[i].W[:, cutoff:].copy()\n                child2.layers[i].w0[:, cutoff:] = parent1.layers[i].w0[:, cutoff:].copy()\n        \n        return child1, child2\n\n    def _calculate_fitness(self):\n        \"\"\" Evaluate the NNs on the test set to get fitness scores \"\"\"\n        for individual in self.population:\n            loss, acc = individual.test_on_batch(self.X, self.y)\n            individual.fitness = 1 / (loss + 1e-8)\n            individual.accuracy = acc\n\n    def evolve(self, X, y, n_generations):\n        \"\"\" Will evolve the population for n_generations based on dataset X and 
labels y\"\"\"\n        self.X, self.y = X, y\n\n        self._initialize_population()\n\n        # The 40% highest fittest individuals will be selected for the next generation\n        n_winners = int(self.population_size * 0.4)\n        # The fittest 60% of the population will be selected as parents to form offspring\n        n_parents = self.population_size - n_winners\n\n        for epoch in range(n_generations):\n            # Determine the fitness of the individuals in the population\n            self._calculate_fitness()\n\n            # Sort population by fitness\n            sorted_i = np.argsort([model.fitness for model in self.population])[::-1]\n            self.population = [self.population[i] for i in sorted_i]\n\n            # Get the individual with the highest fitness\n            fittest_individual = self.population[0]\n            print (\"[%d Best Individual - Fitness: %.5f, Accuracy: %.1f%%]\" % (epoch, \n                                                                        fittest_individual.fitness, \n                                                                        float(100*fittest_individual.accuracy)))\n            # The 'winners' are selected for the next generation\n            next_population = [self.population[i] for i in range(n_winners)]\n\n            total_fitness = np.sum([model.fitness for model in self.population])\n            # The probability that a individual will be selected as a parent is proportionate to its fitness\n            parent_probabilities = [model.fitness / total_fitness for model in self.population]\n            # Select parents according to probabilities (without replacement to preserve diversity)\n            parents = np.random.choice(self.population, size=n_parents, p=parent_probabilities, replace=False)\n            for i in np.arange(0, len(parents), 2):\n                # Perform crossover to produce offspring\n                child1, child2 = self._crossover(parents[i], parents[i+1])\n         
       # Save mutated offspring for next population\n                next_population += [self._mutate(child1), self._mutate(child2)]\n\n            self.population = next_population\n\n        return fittest_individual\n\n"
  },
  {
    "path": "mlfromscratch/supervised_learning/particle_swarm_optimization.py",
    "content": "from __future__ import print_function, division\nimport numpy as np\nimport copy\n\nclass ParticleSwarmOptimizedNN():\n    \"\"\" Particle Swarm Optimization of Neural Network.\n\n    Parameters:\n    -----------\n    n_individuals: int\n        The number of neural networks that are allowed in the population at a time.\n    model_builder: method\n        A method which returns a user specified NeuralNetwork instance.\n    inertia_weight:     float [0,1)\n    cognitive_weight:   float [0,1)\n    social_weight:      float [0,1)\n    max_velocity: float\n        The maximum allowed value for the velocity.\n\n    Reference:\n        Neural Network Training Using Particle Swarm Optimization\n        https://visualstudiomagazine.com/articles/2013/12/01/neural-network-training-using-particle-swarm-optimization.aspx \n    \"\"\"\n    def __init__(self, population_size, \n                        model_builder, \n                        inertia_weight=0.8, \n                        cognitive_weight=2, \n                        social_weight=2, \n                        max_velocity=20):\n        self.population_size = population_size\n        self.model_builder = model_builder\n        self.best_individual = None\n        # Parameters used to update velocity\n        self.cognitive_w = cognitive_weight\n        self.inertia_w = inertia_weight\n        self.social_w = social_weight\n        self.min_v = -max_velocity\n        self.max_v = max_velocity\n\n    def _build_model(self, id):\n        \"\"\" Returns a new individual \"\"\"\n        model = self.model_builder(n_inputs=self.X.shape[1], n_outputs=self.y.shape[1])\n        model.id = id\n        model.fitness = 0\n        model.highest_fitness = 0\n        model.accuracy = 0\n        # Set intial best as the current initialization\n        model.best_layers = copy.copy(model.layers)\n\n        # Set initial velocity to zero\n        model.velocity = []\n        for layer in model.layers:\n            
velocity = {\"W\": 0, \"w0\": 0}\n            if hasattr(layer, 'W'):\n                velocity = {\"W\": np.zeros_like(layer.W), \"w0\": np.zeros_like(layer.w0)}\n            model.velocity.append(velocity)\n\n        return model\n\n    def _initialize_population(self):\n        \"\"\" Initialization of the neural networks forming the population\"\"\"\n        self.population = []\n        for i in range(self.population_size):\n            model = self._build_model(id=i)\n            self.population.append(model)\n\n    def _update_weights(self, individual):\n        \"\"\" Calculate the new velocity and update weights for each layer \"\"\"\n        # Two random parameters used to update the velocity\n        r1 = np.random.uniform()\n        r2 = np.random.uniform()\n        for i, layer in enumerate(individual.layers):\n            if hasattr(layer, 'W'):\n                # Layer weights velocity\n                first_term_W = self.inertia_w * individual.velocity[i][\"W\"]\n                second_term_W = self.cognitive_w * r1 * (individual.best_layers[i].W - layer.W)\n                third_term_W = self.social_w * r2 * (self.best_individual.layers[i].W - layer.W)\n                new_velocity = first_term_W + second_term_W + third_term_W\n                individual.velocity[i][\"W\"] = np.clip(new_velocity, self.min_v, self.max_v)\n\n                # Bias weight velocity\n                first_term_w0 = self.inertia_w * individual.velocity[i][\"w0\"]\n                second_term_w0 = self.cognitive_w * r1 * (individual.best_layers[i].w0 - layer.w0)\n                third_term_w0 = self.social_w * r2 * (self.best_individual.layers[i].w0 - layer.w0)\n                new_velocity = first_term_w0 + second_term_w0 + third_term_w0\n                individual.velocity[i][\"w0\"] = np.clip(new_velocity, self.min_v, self.max_v)\n\n                # Update layer weights with velocity\n                individual.layers[i].W += individual.velocity[i][\"W\"]\n            
    individual.layers[i].w0 += individual.velocity[i][\"w0\"]\n        \n    def _calculate_fitness(self, individual):\n        \"\"\" Evaluate the individual on the test set to get fitness scores \"\"\"\n        loss, acc = individual.test_on_batch(self.X, self.y)\n        individual.fitness = 1 / (loss + 1e-8)\n        individual.accuracy = acc\n\n    def evolve(self, X, y, n_generations):\n        \"\"\" Will evolve the population for n_generations based on dataset X and labels y\"\"\"\n        self.X, self.y = X, y\n\n        self._initialize_population()\n\n        # The best individual of the population is initialized as population's first ind.\n        self.best_individual = copy.copy(self.population[0])\n\n        for epoch in range(n_generations):\n            for individual in self.population:\n                # Calculate new velocity and update the NN weights\n                self._update_weights(individual)\n                # Calculate the fitness of the updated individual\n                self._calculate_fitness(individual)\n\n                # If the current fitness is higher than the individual's previous highest\n                # => update the individual's best layer setup\n                if individual.fitness > individual.highest_fitness:\n                    individual.best_layers = copy.copy(individual.layers)\n                    individual.highest_fitness = individual.fitness\n                # If the individual's fitness is higher than the highest recorded fitness for the\n                # whole population => update the best individual\n                if individual.fitness > self.best_individual.fitness:\n                    self.best_individual = copy.copy(individual)\n\n            print (\"[%d Best Individual - ID: %d Fitness: %.5f, Accuracy: %.1f%%]\" % (epoch,\n                                                                            self.best_individual.id,\n                                                                           
 self.best_individual.fitness,\n                                                                            100*float(self.best_individual.accuracy)))\n        return self.best_individual\n\n"
  },
  {
    "path": "mlfromscratch/supervised_learning/perceptron.py",
    "content": "from __future__ import print_function, division\nimport math\nimport numpy as np\n\n# Import helper functions\nfrom mlfromscratch.utils import train_test_split, to_categorical, normalize, accuracy_score\nfrom mlfromscratch.deep_learning.activation_functions import Sigmoid, ReLU, SoftPlus, LeakyReLU, TanH, ELU\nfrom mlfromscratch.deep_learning.loss_functions import CrossEntropy, SquareLoss\nfrom mlfromscratch.utils import Plot\nfrom mlfromscratch.utils.misc import bar_widgets\nimport progressbar\n\nclass Perceptron():\n    \"\"\"The Perceptron. One layer neural network classifier.\n\n    Parameters:\n    -----------\n    n_iterations: float\n        The number of training iterations the algorithm will tune the weights for.\n    activation_function: class\n        The activation that shall be used for each neuron.\n        Possible choices: Sigmoid, ExpLU, ReLU, LeakyReLU, SoftPlus, TanH\n    loss: class\n        The loss function used to assess the model's performance.\n        Possible choices: SquareLoss, CrossEntropy\n    learning_rate: float\n        The step length that will be used when updating the weights.\n    \"\"\"\n    def __init__(self, n_iterations=20000, activation_function=Sigmoid, loss=SquareLoss, learning_rate=0.01):\n        self.n_iterations = n_iterations\n        self.learning_rate = learning_rate\n        self.loss = loss()\n        self.activation_func = activation_function()\n        self.progressbar = progressbar.ProgressBar(widgets=bar_widgets)\n\n    def fit(self, X, y):\n        n_samples, n_features = np.shape(X)\n        _, n_outputs = np.shape(y)\n\n        # Initialize weights between [-1/sqrt(N), 1/sqrt(N)]\n        limit = 1 / math.sqrt(n_features)\n        self.W = np.random.uniform(-limit, limit, (n_features, n_outputs))\n        self.w0 = np.zeros((1, n_outputs))\n\n        for i in self.progressbar(range(self.n_iterations)):\n            # Calculate outputs\n            linear_output = X.dot(self.W) + self.w0\n  
          y_pred = self.activation_func(linear_output)\n            # Calculate the loss gradient w.r.t the input of the activation function\n            error_gradient = self.loss.gradient(y, y_pred) * self.activation_func.gradient(linear_output)\n            # Calculate the gradient of the loss with respect to each weight\n            grad_wrt_w = X.T.dot(error_gradient)\n            grad_wrt_w0 = np.sum(error_gradient, axis=0, keepdims=True)\n            # Update weights\n            self.W  -= self.learning_rate * grad_wrt_w\n            self.w0 -= self.learning_rate  * grad_wrt_w0\n\n    # Use the trained model to predict labels of X\n    def predict(self, X):\n        y_pred = self.activation_func(X.dot(self.W) + self.w0)\n        return y_pred\n"
  },
  {
    "path": "mlfromscratch/supervised_learning/random_forest.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\nimport math\nimport progressbar\n\n# Import helper functions\nfrom mlfromscratch.utils import divide_on_feature, train_test_split, get_random_subsets, normalize\nfrom mlfromscratch.utils import accuracy_score, calculate_entropy\nfrom mlfromscratch.unsupervised_learning import PCA\nfrom mlfromscratch.supervised_learning import ClassificationTree\nfrom mlfromscratch.utils.misc import bar_widgets\nfrom mlfromscratch.utils import Plot\n\n\nclass RandomForest():\n    \"\"\"Random Forest classifier. Uses a collection of classification trees that\n    trains on random subsets of the data using a random subsets of the features.\n\n    Parameters:\n    -----------\n    n_estimators: int\n        The number of classification trees that are used.\n    max_features: int\n        The maximum number of features that the classification trees are allowed to\n        use.\n    min_samples_split: int\n        The minimum number of samples needed to make a split when building a tree.\n    min_gain: float\n        The minimum impurity required to split the tree further. \n    max_depth: int\n        The maximum depth of a tree.\n    \"\"\"\n    def __init__(self, n_estimators=100, max_features=None, min_samples_split=2,\n                 min_gain=0, max_depth=float(\"inf\")):\n        self.n_estimators = n_estimators    # Number of trees\n        self.max_features = max_features    # Maxmimum number of features per tree\n        self.min_samples_split = min_samples_split\n        self.min_gain = min_gain            # Minimum information gain req. 
to continue\n        self.max_depth = max_depth          # Maximum depth for tree\n        self.progressbar = progressbar.ProgressBar(widgets=bar_widgets)\n\n        # Initialize decision trees\n        self.trees = []\n        for _ in range(n_estimators):\n            self.trees.append(\n                ClassificationTree(\n                    min_samples_split=self.min_samples_split,\n                    min_impurity=min_gain,\n                    max_depth=self.max_depth))\n\n    def fit(self, X, y):\n        n_features = np.shape(X)[1]\n        # If max_features have not been defined => select it as\n        # sqrt(n_features)\n        if not self.max_features:\n            self.max_features = int(math.sqrt(n_features))\n\n        # Choose one random subset of the data for each tree\n        subsets = get_random_subsets(X, y, self.n_estimators)\n\n        for i in self.progressbar(range(self.n_estimators)):\n            X_subset, y_subset = subsets[i]\n            # Feature bagging (select random subsets of the features)\n            idx = np.random.choice(range(n_features), size=self.max_features, replace=True)\n            # Save the indices of the features for prediction\n            self.trees[i].feature_indices = idx\n            # Choose the features corresponding to the indices\n            X_subset = X_subset[:, idx]\n            # Fit the tree to the data\n            self.trees[i].fit(X_subset, y_subset)\n\n    def predict(self, X):\n        y_preds = np.empty((X.shape[0], len(self.trees)))\n        # Let each tree make a prediction on the data\n        for i, tree in enumerate(self.trees):\n            # Indices of the features that the tree has trained on\n            idx = tree.feature_indices\n            # Make a prediction based on those features\n            prediction = tree.predict(X[:, idx])\n            y_preds[:, i] = prediction\n            \n        y_pred = []\n        # For each sample\n        for sample_predictions in y_preds:\n     
       # Select the most common class prediction\n            y_pred.append(np.bincount(sample_predictions.astype('int')).argmax())\n        return y_pred\n"
  },
  {
    "path": "mlfromscratch/supervised_learning/regression.py",
    "content": "from __future__ import print_function, division\nimport numpy as np\nimport math\nfrom mlfromscratch.utils import normalize, polynomial_features\n\nclass l1_regularization():\n    \"\"\" Regularization for Lasso Regression \"\"\"\n    def __init__(self, alpha):\n        self.alpha = alpha\n    \n    def __call__(self, w):\n        return self.alpha * np.linalg.norm(w)\n\n    def grad(self, w):\n        return self.alpha * np.sign(w)\n\nclass l2_regularization():\n    \"\"\" Regularization for Ridge Regression \"\"\"\n    def __init__(self, alpha):\n        self.alpha = alpha\n    \n    def __call__(self, w):\n        return self.alpha * 0.5 *  w.T.dot(w)\n\n    def grad(self, w):\n        return self.alpha * w\n\nclass l1_l2_regularization():\n    \"\"\" Regularization for Elastic Net Regression \"\"\"\n    def __init__(self, alpha, l1_ratio=0.5):\n        self.alpha = alpha\n        self.l1_ratio = l1_ratio\n\n    def __call__(self, w):\n        l1_contr = self.l1_ratio * np.linalg.norm(w)\n        l2_contr = (1 - self.l1_ratio) * 0.5 * w.T.dot(w) \n        return self.alpha * (l1_contr + l2_contr)\n\n    def grad(self, w):\n        l1_contr = self.l1_ratio * np.sign(w)\n        l2_contr = (1 - self.l1_ratio) * w\n        return self.alpha * (l1_contr + l2_contr) \n\nclass Regression(object):\n    \"\"\" Base regression model. Models the relationship between a scalar dependent variable y and the independent \n    variables X. 
\n    Parameters:\n    -----------\n    n_iterations: int\n        The number of training iterations the algorithm will tune the weights for.\n    learning_rate: float\n        The step length that will be used when updating the weights.\n    \"\"\"\n    def __init__(self, n_iterations, learning_rate):\n        self.n_iterations = n_iterations\n        self.learning_rate = learning_rate\n\n    def initialize_weights(self, n_features):\n        \"\"\" Initialize weights randomly in [-1/sqrt(N), 1/sqrt(N)] \"\"\"\n        limit = 1 / math.sqrt(n_features)\n        self.w = np.random.uniform(-limit, limit, (n_features, ))\n\n    def fit(self, X, y):\n        # Insert constant ones for bias weights\n        X = np.insert(X, 0, 1, axis=1)\n        self.training_errors = []\n        self.initialize_weights(n_features=X.shape[1])\n\n        # Do gradient descent for n_iterations\n        for i in range(self.n_iterations):\n            y_pred = X.dot(self.w)\n            # Calculate l2 loss\n            mse = np.mean(0.5 * (y - y_pred)**2 + self.regularization(self.w))\n            self.training_errors.append(mse)\n            # Gradient of l2 loss w.r.t w\n            grad_w = -(y - y_pred).dot(X) + self.regularization.grad(self.w)\n            # Update the weights\n            self.w -= self.learning_rate * grad_w\n\n    def predict(self, X):\n        # Insert constant ones for bias weights\n        X = np.insert(X, 0, 1, axis=1)\n        y_pred = X.dot(self.w)\n        return y_pred\n\nclass LinearRegression(Regression):\n    \"\"\"Linear model.\n    Parameters:\n    -----------\n    n_iterations: int\n        The number of training iterations the algorithm will tune the weights for.\n    learning_rate: float\n        The step length that will be used when updating the weights.\n    gradient_descent: boolean\n        Whether gradient descent should be used when training. 
If \n        false then we use batch optimization by least squares.\n    \"\"\"\n    def __init__(self, n_iterations=100, learning_rate=0.001, gradient_descent=True):\n        self.gradient_descent = gradient_descent\n        # No regularization\n        self.regularization = lambda x: 0\n        self.regularization.grad = lambda x: 0\n        super(LinearRegression, self).__init__(n_iterations=n_iterations,\n                                            learning_rate=learning_rate)\n    def fit(self, X, y):\n        # If not gradient descent => Least squares approximation of w\n        if not self.gradient_descent:\n            # Insert constant ones for bias weights\n            X = np.insert(X, 0, 1, axis=1)\n            # Calculate weights by least squares (using Moore-Penrose pseudoinverse)\n            # np.linalg.svd returns V transposed, so transpose it back before inverting\n            U, S, Vt = np.linalg.svd(X.T.dot(X))\n            S = np.diag(S)\n            X_sq_reg_inv = Vt.T.dot(np.linalg.pinv(S)).dot(U.T)\n            self.w = X_sq_reg_inv.dot(X.T).dot(y)\n        else:\n            super(LinearRegression, self).fit(X, y)\n\nclass LassoRegression(Regression):\n    \"\"\"Linear regression model with a regularization factor which does both variable selection \n    and regularization. Model that tries to balance the fit of the model with respect to the training \n    data and the complexity of the model. A large regularization factor decreases the variance of \n    the model.\n    Parameters:\n    -----------\n    degree: int\n        The degree of the polynomial that the independent variable X will be transformed to.\n    reg_factor: float\n        The factor that will determine the amount of regularization and feature\n        shrinkage. 
\n    n_iterations: int\n        The number of training iterations the algorithm will tune the weights for.\n    learning_rate: float\n        The step length that will be used when updating the weights.\n    \"\"\"\n    def __init__(self, degree, reg_factor, n_iterations=3000, learning_rate=0.01):\n        self.degree = degree\n        self.regularization = l1_regularization(alpha=reg_factor)\n        super(LassoRegression, self).__init__(n_iterations, \n                                            learning_rate)\n\n    def fit(self, X, y):\n        X = normalize(polynomial_features(X, degree=self.degree))\n        super(LassoRegression, self).fit(X, y)\n\n    def predict(self, X):\n        X = normalize(polynomial_features(X, degree=self.degree))\n        return super(LassoRegression, self).predict(X)\n\nclass PolynomialRegression(Regression):\n    \"\"\"Performs a non-linear transformation of the data before fitting the model\n    and making predictions, which allows for non-linear regression.\n    Parameters:\n    -----------\n    degree: int\n        The degree of the polynomial that the independent variable X will be transformed to.\n    n_iterations: int\n        The number of training iterations the algorithm will tune the weights for.\n    learning_rate: float\n        The step length that will be used when updating the weights.\n    \"\"\"\n    def __init__(self, degree, n_iterations=3000, learning_rate=0.001):\n        self.degree = degree\n        # No regularization\n        self.regularization = lambda x: 0\n        self.regularization.grad = lambda x: 0\n        super(PolynomialRegression, self).__init__(n_iterations=n_iterations,\n                                                learning_rate=learning_rate)\n\n    def fit(self, X, y):\n        X = polynomial_features(X, degree=self.degree)\n        super(PolynomialRegression, self).fit(X, y)\n\n    def predict(self, X):\n        X = polynomial_features(X, degree=self.degree)\n        return 
super(PolynomialRegression, self).predict(X)\n\nclass RidgeRegression(Regression):\n    \"\"\"Also referred to as Tikhonov regularization. Linear regression model with a regularization factor.\n    Model that tries to balance the fit of the model with respect to the training data and the complexity\n    of the model. A large regularization factor decreases the variance of the model.\n    Parameters:\n    -----------\n    reg_factor: float\n        The factor that will determine the amount of regularization and feature\n        shrinkage. \n    n_iterations: int\n        The number of training iterations the algorithm will tune the weights for.\n    learning_rate: float\n        The step length that will be used when updating the weights.\n    \"\"\"\n    def __init__(self, reg_factor, n_iterations=1000, learning_rate=0.001):\n        self.regularization = l2_regularization(alpha=reg_factor)\n        super(RidgeRegression, self).__init__(n_iterations, \n                                            learning_rate)\n\nclass PolynomialRidgeRegression(Regression):\n    \"\"\"Similar to regular ridge regression except that the data is transformed to allow\n    for polynomial regression.\n    Parameters:\n    -----------\n    degree: int\n        The degree of the polynomial that the independent variable X will be transformed to.\n    reg_factor: float\n        The factor that will determine the amount of regularization and feature\n        shrinkage. 
\n    n_iterations: int\n        The number of training iterations the algorithm will tune the weights for.\n    learning_rate: float\n        The step length that will be used when updating the weights.\n    \"\"\"\n    def __init__(self, degree, reg_factor, n_iterations=3000, learning_rate=0.01, gradient_descent=True):\n        self.degree = degree\n        self.regularization = l2_regularization(alpha=reg_factor)\n        super(PolynomialRidgeRegression, self).__init__(n_iterations, \n                                                        learning_rate)\n\n    def fit(self, X, y):\n        X = normalize(polynomial_features(X, degree=self.degree))\n        super(PolynomialRidgeRegression, self).fit(X, y)\n\n    def predict(self, X):\n        X = normalize(polynomial_features(X, degree=self.degree))\n        return super(PolynomialRidgeRegression, self).predict(X)\n\nclass ElasticNet(Regression):\n    \"\"\" Regression where a combination of l1 and l2 regularization is used. The\n    ratio of their contributions is set with the 'l1_ratio' parameter.\n    Parameters:\n    -----------\n    degree: int\n        The degree of the polynomial that the independent variable X will be transformed to.\n    reg_factor: float\n        The factor that will determine the amount of regularization and feature\n        shrinkage. 
\n    l1_ratio: float\n        Weighs the contribution of l1 and l2 regularization.\n    n_iterations: int\n        The number of training iterations the algorithm will tune the weights for.\n    learning_rate: float\n        The step length that will be used when updating the weights.\n    \"\"\"\n    def __init__(self, degree=1, reg_factor=0.05, l1_ratio=0.5, n_iterations=3000, \n                learning_rate=0.01):\n        self.degree = degree\n        self.regularization = l1_l2_regularization(alpha=reg_factor, l1_ratio=l1_ratio)\n        super(ElasticNet, self).__init__(n_iterations, \n                                        learning_rate)\n\n    def fit(self, X, y):\n        X = normalize(polynomial_features(X, degree=self.degree))\n        super(ElasticNet, self).fit(X, y)\n\n    def predict(self, X):\n        X = normalize(polynomial_features(X, degree=self.degree))\n        return super(ElasticNet, self).predict(X)\n"
  },
  {
    "path": "mlfromscratch/supervised_learning/support_vector_machine.py",
    "content": "\nfrom __future__ import division, print_function\nimport numpy as np\nimport cvxopt\nfrom mlfromscratch.utils import train_test_split, normalize, accuracy_score\nfrom mlfromscratch.utils.kernels import *\nfrom mlfromscratch.utils import Plot\n\n# Hide cvxopt output\ncvxopt.solvers.options['show_progress'] = False\n\nclass SupportVectorMachine(object):\n    \"\"\"The Support Vector Machine classifier.\n    Uses cvxopt to solve the quadratic optimization problem.\n\n    Parameters:\n    -----------\n    C: float\n        Penalty term.\n    kernel: function\n        Kernel function. Can be either polynomial, rbf or linear.\n    power: int\n        The degree of the polynomial kernel. Will be ignored by the other\n        kernel functions.\n    gamma: float\n        Used in the rbf kernel function.\n    coef: float\n        Bias term used in the polynomial kernel function.\n    \"\"\"\n    def __init__(self, C=1, kernel=rbf_kernel, power=4, gamma=None, coef=4):\n        self.C = C\n        self.kernel = kernel\n        self.power = power\n        self.gamma = gamma\n        self.coef = coef\n        self.lagr_multipliers = None\n        self.support_vectors = None\n        self.support_vector_labels = None\n        self.intercept = None\n\n    def fit(self, X, y):\n\n        n_samples, n_features = np.shape(X)\n\n        # Set gamma to 1/n_features by default\n        if not self.gamma:\n            self.gamma = 1 / n_features\n\n        # Initialize kernel method with parameters\n        self.kernel = self.kernel(\n            power=self.power,\n            gamma=self.gamma,\n            coef=self.coef)\n\n        # Calculate kernel matrix\n        kernel_matrix = np.zeros((n_samples, n_samples))\n        for i in range(n_samples):\n            for j in range(n_samples):\n                kernel_matrix[i, j] = self.kernel(X[i], X[j])\n\n        # Define the quadratic optimization problem\n        P = cvxopt.matrix(np.outer(y, y) * kernel_matrix, 
tc='d')\n        q = cvxopt.matrix(np.ones(n_samples) * -1)\n        A = cvxopt.matrix(y, (1, n_samples), tc='d')\n        b = cvxopt.matrix(0, tc='d')\n\n        if not self.C:\n            G = cvxopt.matrix(np.identity(n_samples) * -1)\n            h = cvxopt.matrix(np.zeros(n_samples))\n        else:\n            G_max = np.identity(n_samples) * -1\n            G_min = np.identity(n_samples)\n            G = cvxopt.matrix(np.vstack((G_max, G_min)))\n            h_max = cvxopt.matrix(np.zeros(n_samples))\n            h_min = cvxopt.matrix(np.ones(n_samples) * self.C)\n            h = cvxopt.matrix(np.vstack((h_max, h_min)))\n\n        # Solve the quadratic optimization problem using cvxopt\n        minimization = cvxopt.solvers.qp(P, q, G, h, A, b)\n\n        # Lagrange multipliers\n        lagr_mult = np.ravel(minimization['x'])\n\n        # Extract support vectors\n        # Get indexes of non-zero lagr. multipliers\n        idx = lagr_mult > 1e-7\n        # Get the corresponding lagr. 
multipliers\n        self.lagr_multipliers = lagr_mult[idx]\n        # Get the samples that will act as support vectors\n        self.support_vectors = X[idx]\n        # Get the corresponding labels\n        self.support_vector_labels = y[idx]\n\n        # Calculate intercept with first support vector\n        self.intercept = self.support_vector_labels[0]\n        for i in range(len(self.lagr_multipliers)):\n            self.intercept -= self.lagr_multipliers[i] * self.support_vector_labels[\n                i] * self.kernel(self.support_vectors[i], self.support_vectors[0])\n\n    def predict(self, X):\n        y_pred = []\n        # Iterate through list of samples and make predictions\n        for sample in X:\n            prediction = 0\n            # Determine the label of the sample by the support vectors\n            for i in range(len(self.lagr_multipliers)):\n                prediction += self.lagr_multipliers[i] * self.support_vector_labels[\n                    i] * self.kernel(self.support_vectors[i], sample)\n            prediction += self.intercept\n            y_pred.append(np.sign(prediction))\n        return np.array(y_pred)\n"
  },
  {
    "path": "mlfromscratch/supervised_learning/xgboost.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\nimport progressbar\n\nfrom mlfromscratch.utils import train_test_split, standardize, to_categorical, normalize\nfrom mlfromscratch.utils import mean_squared_error, accuracy_score\nfrom mlfromscratch.supervised_learning import XGBoostRegressionTree\nfrom mlfromscratch.deep_learning.activation_functions import Sigmoid\nfrom mlfromscratch.utils.misc import bar_widgets\nfrom mlfromscratch.utils import Plot\n\n\nclass LogisticLoss():\n    def __init__(self):\n        sigmoid = Sigmoid()\n        self.log_func = sigmoid\n        self.log_grad = sigmoid.gradient\n\n    def loss(self, y, y_pred):\n        y_pred = np.clip(y_pred, 1e-15, 1 - 1e-15)\n        p = self.log_func(y_pred)\n        return y * np.log(p) + (1 - y) * np.log(1 - p)\n\n    # gradient w.r.t y_pred\n    def gradient(self, y, y_pred):\n        p = self.log_func(y_pred)\n        return -(y - p)\n\n    # w.r.t y_pred\n    def hess(self, y, y_pred):\n        p = self.log_func(y_pred)\n        return p * (1 - p)\n\n\nclass XGBoost(object):\n    \"\"\"The XGBoost classifier.\n\n    Reference: http://xgboost.readthedocs.io/en/latest/model.html\n\n    Parameters:\n    -----------\n    n_estimators: int\n        The number of classification trees that are used.\n    learning_rate: float\n        The step length that will be taken when following the negative gradient during\n        training.\n    min_samples_split: int\n        The minimum number of samples needed to make a split when building a tree.\n    min_impurity: float\n        The minimum impurity required to split the tree further. 
\n    max_depth: int\n        The maximum depth of a tree.\n    \"\"\"\n    def __init__(self, n_estimators=200, learning_rate=0.001, min_samples_split=2,\n                 min_impurity=1e-7, max_depth=2):\n        self.n_estimators = n_estimators            # Number of trees\n        self.learning_rate = learning_rate          # Step size for weight update\n        self.min_samples_split = min_samples_split  # Minimum number of samples to justify a split\n        self.min_impurity = min_impurity            # Minimum variance reduction to continue\n        self.max_depth = max_depth                  # Maximum depth for tree\n\n        self.bar = progressbar.ProgressBar(widgets=bar_widgets)\n        \n        # Log loss for classification\n        self.loss = LogisticLoss()\n\n        # Initialize regression trees\n        self.trees = []\n        for _ in range(n_estimators):\n            tree = XGBoostRegressionTree(\n                    min_samples_split=self.min_samples_split,\n                    min_impurity=min_impurity,\n                    max_depth=self.max_depth,\n                    loss=self.loss)\n\n            self.trees.append(tree)\n\n    def fit(self, X, y):\n        y = to_categorical(y)\n\n        y_pred = np.zeros(np.shape(y))\n        for i in self.bar(range(self.n_estimators)):\n            tree = self.trees[i]\n            y_and_pred = np.concatenate((y, y_pred), axis=1)\n            tree.fit(X, y_and_pred)\n            update_pred = tree.predict(X)\n\n            y_pred -= np.multiply(self.learning_rate, update_pred)\n\n    def predict(self, X):\n        y_pred = None\n        # Make predictions\n        for tree in self.trees:\n            # Estimate gradient and update prediction\n            update_pred = tree.predict(X)\n            if y_pred is None:\n                y_pred = np.zeros_like(update_pred)\n            y_pred -= np.multiply(self.learning_rate, update_pred)\n\n        # Turn into probability distribution (Softmax)\n        
y_pred = np.exp(y_pred) / np.sum(np.exp(y_pred), axis=1, keepdims=True)\n        # Set label to the value that maximizes probability\n        y_pred = np.argmax(y_pred, axis=1)\n        return y_pred\n"
  },
  {
    "path": "mlfromscratch/unsupervised_learning/__init__.py",
    "content": "from .principal_component_analysis import PCA\nfrom .apriori import Apriori\nfrom .dbscan import DBSCAN\nfrom .fp_growth import FPGrowth\nfrom .gaussian_mixture_model import GaussianMixtureModel\nfrom .genetic_algorithm import GeneticAlgorithm\nfrom .k_means import KMeans\nfrom .partitioning_around_medoids import PAM\nfrom .restricted_boltzmann_machine import RBM\n"
  },
  {
    "path": "mlfromscratch/unsupervised_learning/apriori.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\nimport itertools\n\n\nclass Rule():\n    def __init__(self, antecedent, concequent, confidence, support):\n        self.antecedent = antecedent\n        self.concequent = concequent\n        self.confidence = confidence\n        self.support = support\n\n\nclass Apriori():\n    \"\"\"A method for determining frequent itemsets in a transactional database and\n    also for generating rules for those itemsets. \n\n    Parameters:\n    -----------\n    min_sup: float\n        The minimum fraction of transactions an itemets needs to\n        occur in to be deemed frequent\n    min_conf: float:\n        The minimum fraction of times the antecedent needs to imply\n        the concequent to justify rule\n    \"\"\"\n    def __init__(self, min_sup=0.3, min_conf=0.81):\n\n        self.min_sup = min_sup\n        self.min_conf = min_conf\n        self.freq_itemsets = None       # List of freqeuent itemsets\n        self.transactions = None        # List of transactions\n\n    def _calculate_support(self, itemset):\n        count = 0\n        for transaction in self.transactions:\n            if self._transaction_contains_items(transaction, itemset):\n                count += 1\n        support = count / len(self.transactions)\n        return support\n\n\n    def _get_frequent_itemsets(self, candidates):\n        \"\"\" Prunes the candidates that are not frequent => returns list with \n        only frequent itemsets \"\"\"\n        frequent = []\n        # Find frequent items\n        for itemset in candidates:\n            support = self._calculate_support(itemset)\n            if support >= self.min_sup:\n                frequent.append(itemset)\n        return frequent\n\n\n    def _has_infrequent_itemsets(self, candidate):\n        \"\"\" True or false depending on the candidate has any\n        subset with size k - 1 that is not in the frequent itemset \"\"\"\n        k = len(candidate)\n  
      # Find all combinations of size k-1 in candidate\n        # E.g [1,2,3] => [[1,2],[1,3],[2,3]]\n        subsets = list(itertools.combinations(candidate, k - 1))\n        for t in subsets:\n            # t - is tuple. If size == 1 get the element\n            subset = list(t) if len(t) > 1 else t[0]\n            if not subset in self.freq_itemsets[-1]:\n                return True\n        return False\n\n\n    def _generate_candidates(self, freq_itemset):\n        \"\"\" Joins the elements in the frequent itemset and prunes\n        resulting sets if they contain subsets that have been determined\n        to be infrequent. \"\"\"\n        candidates = []\n        for itemset1 in freq_itemset:\n            for itemset2 in freq_itemset:\n                # Valid if every element but the last are the same\n                # and the last element in itemset1 is smaller than the last\n                # in itemset2\n                valid = False\n                single_item = isinstance(itemset1, int)\n                if single_item and itemset1 < itemset2:\n                    valid = True\n                elif not single_item and np.array_equal(itemset1[:-1], itemset2[:-1]) and itemset1[-1] < itemset2[-1]:\n                    valid = True\n\n                if valid:\n                    # JOIN: Add the last element in itemset2 to itemset1 to\n                    # create a new candidate\n                    if single_item:\n                        candidate = [itemset1, itemset2]\n                    else:\n                        candidate = itemset1 + [itemset2[-1]]\n                    # PRUNE: Check if any subset of candidate have been determined\n                    # to be infrequent\n                    infrequent = self._has_infrequent_itemsets(candidate)\n                    if not infrequent:\n                        candidates.append(candidate)\n        return candidates\n\n\n    def _transaction_contains_items(self, transaction, items):\n        
\"\"\" True or false depending on each item in the itemset is\n        in the transaction \"\"\"\n        # If items is in fact only one item\n        if isinstance(items, int):\n            return items in transaction\n        # Iterate through list of items and make sure that\n        # all items are in the transaction\n        for item in items:\n            if not item in transaction:\n                return False\n        return True\n\n    def find_frequent_itemsets(self, transactions):\n        \"\"\" Returns the set of frequent itemsets in the list of transactions \"\"\"\n        self.transactions = transactions\n        # Get all unique items in the transactions\n        unique_items = set(item for transaction in self.transactions for item in transaction)\n        # Get the frequent items\n        self.freq_itemsets = [self._get_frequent_itemsets(unique_items)]\n        while(True):\n            # Generate new candidates from last added frequent itemsets\n            candidates = self._generate_candidates(self.freq_itemsets[-1])\n            # Get the frequent itemsets among those candidates\n            frequent_itemsets = self._get_frequent_itemsets(candidates)\n\n            # If there are no frequent itemsets we're done\n            if not frequent_itemsets:\n                break\n\n            # Add them to the total list of frequent itemsets and start over\n            self.freq_itemsets.append(frequent_itemsets)\n\n        # Flatten the array and return every frequent itemset\n        frequent_itemsets = [\n            itemset for sublist in self.freq_itemsets for itemset in sublist]\n        return frequent_itemsets\n\n\n    def _rules_from_itemset(self, initial_itemset, itemset):\n        \"\"\" Recursive function which returns the rules where confidence >= min_confidence\n        Starts with large itemset and recursively explores rules for subsets \"\"\"\n        rules = []\n        k = len(itemset)\n        # Get all combinations of 
sub-itemsets of size k - 1 from itemset\n        # E.g [1,2,3] => [[1,2],[1,3],[2,3]]\n        subsets = list(itertools.combinations(itemset, k - 1))\n        support = self._calculate_support(initial_itemset)\n        for antecedent in subsets:\n            # itertools.combinations returns tuples => convert to list\n            antecedent = list(antecedent)\n            antecedent_support = self._calculate_support(antecedent)\n            # Calculate the confidence as sup(A and B) / sup(B), if antecedent\n            # is B in an itemset of A and B\n            confidence = float(\"{0:.2f}\".format(support / antecedent_support))\n            if confidence >= self.min_conf:\n                # The concequent is the initial_itemset except for antecedent\n                concequent = [itemset for itemset in initial_itemset if not itemset in antecedent]\n                # If single item => get item\n                if len(antecedent) == 1:\n                    antecedent = antecedent[0]\n                if len(concequent) == 1:\n                    concequent = concequent[0]\n                # Create new rule\n                rule = Rule(\n                        antecedent=antecedent,\n                        concequent=concequent,\n                        confidence=confidence,\n                        support=support)\n                rules.append(rule)\n                    \n                # If there are subsets that could result in rules\n                # recursively add rules from subsets\n                if k - 1 > 1:\n                    rules += self._rules_from_itemset(initial_itemset, antecedent)\n        return rules\n\n    def generate_rules(self, transactions):\n        self.transactions = transactions\n        frequent_itemsets = self.find_frequent_itemsets(transactions)\n        # Only consider itemsets of size >= 2 items\n        frequent_itemsets = [itemset for itemset in frequent_itemsets if not isinstance(\n                itemset, int)]\n        
rules = []\n        for itemset in frequent_itemsets:\n            rules += self._rules_from_itemset(itemset, itemset)\n        # Remove empty values\n        return rules\n\n"
  },
  {
    "path": "mlfromscratch/unsupervised_learning/autoencoder.py",
    "content": "from __future__ import print_function, division\nfrom sklearn import datasets\nimport math\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport progressbar\n\nfrom sklearn.datasets import fetch_mldata\n\nfrom mlfromscratch.deep_learning.optimizers import Adam\nfrom mlfromscratch.deep_learning.loss_functions import CrossEntropy, SquareLoss\nfrom mlfromscratch.deep_learning.layers import Dense, Dropout, Flatten, Activation, Reshape, BatchNormalization\nfrom mlfromscratch.deep_learning import NeuralNetwork\n\n\nclass Autoencoder():\n    \"\"\"An Autoencoder with deep fully-connected neural nets.\n\n    Training Data: MNIST Handwritten Digits (28x28 images)\n    \"\"\"\n    def __init__(self):\n        self.img_rows = 28\n        self.img_cols = 28\n        self.img_dim = self.img_rows * self.img_cols\n        self.latent_dim = 128 # The dimension of the data embedding\n\n        optimizer = Adam(learning_rate=0.0002, b1=0.5)\n        loss_function = SquareLoss\n\n        self.encoder = self.build_encoder(optimizer, loss_function)\n        self.decoder = self.build_decoder(optimizer, loss_function)\n\n        self.autoencoder = NeuralNetwork(optimizer=optimizer, loss=loss_function)\n        self.autoencoder.layers.extend(self.encoder.layers)\n        self.autoencoder.layers.extend(self.decoder.layers)\n\n        print ()\n        self.autoencoder.summary(name=\"Variational Autoencoder\")\n\n    def build_encoder(self, optimizer, loss_function):\n\n        encoder = NeuralNetwork(optimizer=optimizer, loss=loss_function)\n        encoder.add(Dense(512, input_shape=(self.img_dim,)))\n        encoder.add(Activation('leaky_relu'))\n        encoder.add(BatchNormalization(momentum=0.8))\n        encoder.add(Dense(256))\n        encoder.add(Activation('leaky_relu'))\n        encoder.add(BatchNormalization(momentum=0.8))\n        encoder.add(Dense(self.latent_dim))\n\n        return encoder\n\n    def build_decoder(self, optimizer, loss_function):\n\n    
    decoder = NeuralNetwork(optimizer=optimizer, loss=loss_function)\n        decoder.add(Dense(256, input_shape=(self.latent_dim,)))\n        decoder.add(Activation('leaky_relu'))\n        decoder.add(BatchNormalization(momentum=0.8))\n        decoder.add(Dense(512))\n        decoder.add(Activation('leaky_relu'))\n        decoder.add(BatchNormalization(momentum=0.8))\n        decoder.add(Dense(self.img_dim))\n        decoder.add(Activation('tanh'))\n\n        return decoder\n\n    def train(self, n_epochs, batch_size=128, save_interval=50):\n\n        mnist = fetch_mldata('MNIST original')\n\n        X = mnist.data\n        y = mnist.target\n\n        # Rescale [-1, 1]\n        X = (X.astype(np.float32) - 127.5) / 127.5\n\n        for epoch in range(n_epochs):\n\n            # Select a random half batch of images\n            idx = np.random.randint(0, X.shape[0], batch_size)\n            imgs = X[idx]\n\n            # Train the Autoencoder\n            loss, _ = self.autoencoder.train_on_batch(imgs, imgs)\n\n            # Display the progress\n            print (\"%d [D loss: %f]\" % (epoch, loss))\n\n            # If at save interval => save generated image samples\n            if epoch % save_interval == 0:\n                self.save_imgs(epoch, X)\n\n    def save_imgs(self, epoch, X):\n        r, c = 5, 5 # Grid size\n        # Select a random half batch of images\n        idx = np.random.randint(0, X.shape[0], r*c)\n        imgs = X[idx]\n        # Generate images and reshape to image shape\n        gen_imgs = self.autoencoder.predict(imgs).reshape((-1, self.img_rows, self.img_cols))\n\n        # Rescale images 0 - 1\n        gen_imgs = 0.5 * gen_imgs + 0.5\n\n        fig, axs = plt.subplots(r, c)\n        plt.suptitle(\"Autoencoder\")\n        cnt = 0\n        for i in range(r):\n            for j in range(c):\n                axs[i,j].imshow(gen_imgs[cnt,:,:], cmap='gray')\n                axs[i,j].axis('off')\n                cnt += 1\n        
fig.savefig(\"ae_%d.png\" % epoch)\n        plt.close()\n\n\nif __name__ == '__main__':\n    ae = Autoencoder()\n    ae.train(n_epochs=200000, batch_size=64, save_interval=400)\n"
  },
  {
    "path": "mlfromscratch/unsupervised_learning/dbscan.py",
    "content": "from __future__ import print_function, division\nimport numpy as np\nfrom mlfromscratch.utils import Plot, euclidean_distance, normalize\n\n\nclass DBSCAN():\n    \"\"\"A density based clustering method that expands clusters from \n    samples that have more neighbors within a radius specified by eps\n    than the value min_samples.\n\n    Parameters:\n    -----------\n    eps: float\n        The radius within which samples are considered neighbors\n    min_samples: int\n        The number of neighbors required for the sample to be a core point. \n    \"\"\"\n    def __init__(self, eps=1, min_samples=5):\n        self.eps = eps\n        self.min_samples = min_samples\n\n    def _get_neighbors(self, sample_i):\n        \"\"\" Return a list of indexes of neighboring samples\n        A sample_2 is considered a neighbor of sample_1 if the distance between\n        them is smaller than epsilon \"\"\"\n        neighbors = []\n        for i, _sample in enumerate(self.X):\n            # Skip the sample itself so that i stays an index into self.X\n            if i == sample_i:\n                continue\n            distance = euclidean_distance(self.X[sample_i], _sample)\n            if distance < self.eps:\n                neighbors.append(i)\n        return np.array(neighbors)\n\n    def _expand_cluster(self, sample_i, neighbors):\n        \"\"\" Recursive method which expands the cluster until we have reached the border\n        of the dense area (density determined by eps and min_samples) \"\"\"\n        cluster = [sample_i]\n        # Iterate through neighbors\n        for neighbor_i in neighbors:\n            if not neighbor_i in self.visited_samples:\n                self.visited_samples.append(neighbor_i)\n                # Fetch the sample's distant neighbors (neighbors of neighbor)\n                self.neighbors[neighbor_i] = self._get_neighbors(neighbor_i)\n                # Make sure the neighbor's neighbors are more than min_samples\n                # (If this is true the neighbor is a core point)\n                if 
len(self.neighbors[neighbor_i]) >= self.min_samples:\n                    # Expand the cluster from the neighbor\n                    expanded_cluster = self._expand_cluster(\n                        neighbor_i, self.neighbors[neighbor_i])\n                    # Add expanded cluster to this cluster\n                    cluster = cluster + expanded_cluster\n                else:\n                    # If the neighbor is not a core point we only add the neighbor point\n                    cluster.append(neighbor_i)\n        return cluster\n\n    def _get_cluster_labels(self):\n        \"\"\" Return the samples labels as the index of the cluster in which they are\n        contained \"\"\"\n        # Set default value to number of clusters\n        # Will make sure all outliers have same cluster label\n        labels = np.full(shape=self.X.shape[0], fill_value=len(self.clusters))\n        for cluster_i, cluster in enumerate(self.clusters):\n            for sample_i in cluster:\n                labels[sample_i] = cluster_i\n        return labels\n\n    # DBSCAN\n    def predict(self, X):\n        self.X = X\n        self.clusters = []\n        self.visited_samples = []\n        self.neighbors = {}\n        n_samples = np.shape(self.X)[0]\n        # Iterate through samples and expand clusters from them\n        # if they have more neighbors than self.min_samples\n        for sample_i in range(n_samples):\n            if sample_i in self.visited_samples:\n                continue\n            self.neighbors[sample_i] = self._get_neighbors(sample_i)\n            if len(self.neighbors[sample_i]) >= self.min_samples:\n                # If core point => mark as visited\n                self.visited_samples.append(sample_i)\n                # Sample has more neighbors than self.min_samples => expand\n                # cluster from sample\n                new_cluster = self._expand_cluster(\n                    sample_i, self.neighbors[sample_i])\n                # Add cluster 
to list of clusters\n                self.clusters.append(new_cluster)\n\n        # Get the resulting cluster labels\n        cluster_labels = self._get_cluster_labels()\n        return cluster_labels\n"
  },
  {
    "path": "mlfromscratch/unsupervised_learning/dcgan.py",
    "content": "from __future__ import print_function, division\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport progressbar\nfrom sklearn.datasets import fetch_mldata\n\nfrom mlfromscratch.deep_learning.optimizers import Adam\nfrom mlfromscratch.deep_learning.loss_functions import CrossEntropy\nfrom mlfromscratch.deep_learning.layers import Dense, Dropout, Flatten, Activation, Reshape, BatchNormalization, ZeroPadding2D, Conv2D, UpSampling2D\nfrom mlfromscratch.deep_learning import NeuralNetwork\n\n\nclass DCGAN():\n    def __init__(self):\n        self.img_rows = 28 \n        self.img_cols = 28\n        self.channels = 1\n        self.img_shape = (self.channels, self.img_rows, self.img_cols)\n        self.latent_dim = 100\n\n        optimizer = Adam(learning_rate=0.0002, b1=0.5)\n        loss_function = CrossEntropy\n\n        # Build the discriminator\n        self.discriminator = self.build_discriminator(optimizer, loss_function)\n\n        # Build the generator\n        self.generator = self.build_generator(optimizer, loss_function)\n\n        # Build the combined model\n        self.combined = NeuralNetwork(optimizer=optimizer, loss=loss_function)\n        self.combined.layers.extend(self.generator.layers)\n        self.combined.layers.extend(self.discriminator.layers)\n\n        print ()\n        self.generator.summary(name=\"Generator\")\n        self.discriminator.summary(name=\"Discriminator\")\n\n    def build_generator(self, optimizer, loss_function):\n        \n        model = NeuralNetwork(optimizer=optimizer, loss=loss_function)\n\n        model.add(Dense(128 * 7 * 7, input_shape=(100,)))\n        model.add(Activation('leaky_relu'))\n        model.add(Reshape((128, 7, 7)))\n        model.add(BatchNormalization(momentum=0.8))\n        model.add(UpSampling2D())\n        model.add(Conv2D(128, filter_shape=(3,3), padding='same'))\n        model.add(Activation(\"leaky_relu\"))\n        model.add(BatchNormalization(momentum=0.8))\n        
model.add(UpSampling2D())\n        model.add(Conv2D(64, filter_shape=(3,3), padding='same'))\n        model.add(Activation(\"leaky_relu\"))\n        model.add(BatchNormalization(momentum=0.8))\n        model.add(Conv2D(1, filter_shape=(3,3), padding='same'))\n        model.add(Activation(\"tanh\"))\n\n        return model\n\n    def build_discriminator(self, optimizer, loss_function):\n        \n        model = NeuralNetwork(optimizer=optimizer, loss=loss_function)\n\n        model.add(Conv2D(32, filter_shape=(3,3), stride=2, input_shape=self.img_shape, padding='same'))\n        model.add(Activation('leaky_relu'))\n        model.add(Dropout(0.25))\n        model.add(Conv2D(64, filter_shape=(3,3), stride=2, padding='same'))\n        model.add(ZeroPadding2D(padding=((0,1),(0,1))))\n        model.add(Activation('leaky_relu'))\n        model.add(Dropout(0.25))\n        model.add(BatchNormalization(momentum=0.8))\n        model.add(Conv2D(128, filter_shape=(3,3), stride=2, padding='same'))\n        model.add(Activation('leaky_relu'))\n        model.add(Dropout(0.25))\n        model.add(BatchNormalization(momentum=0.8))\n        model.add(Conv2D(256, filter_shape=(3,3), stride=1, padding='same'))\n        model.add(Activation('leaky_relu'))\n        model.add(Dropout(0.25))\n        model.add(Flatten())\n        model.add(Dense(2))\n        model.add(Activation('softmax'))\n\n        return model\n\n\n    def train(self, epochs, batch_size=128, save_interval=50):\n\n        mnist = fetch_mldata('MNIST original')\n\n        X = mnist.data.reshape((-1,) + self.img_shape)\n        y = mnist.target\n\n        # Rescale -1 to 1\n        X = (X.astype(np.float32) - 127.5) / 127.5\n\n        half_batch = int(batch_size / 2)\n\n        for epoch in range(epochs):\n\n            # ---------------------\n            #  Train Discriminator\n            # ---------------------\n\n            self.discriminator.set_trainable(True)\n\n            # Select a random half batch of 
images\n            idx = np.random.randint(0, X.shape[0], half_batch)\n            imgs = X[idx]\n\n            # Sample noise to use as generator input\n            noise = np.random.normal(0, 1, (half_batch, 100))\n\n            # Generate a half batch of images\n            gen_imgs = self.generator.predict(noise)\n\n            valid = np.concatenate((np.ones((half_batch, 1)), np.zeros((half_batch, 1))), axis=1)\n            fake = np.concatenate((np.zeros((half_batch, 1)), np.ones((half_batch, 1))), axis=1)\n\n            # Train the discriminator\n            d_loss_real, d_acc_real = self.discriminator.train_on_batch(imgs, valid)\n            d_loss_fake, d_acc_fake = self.discriminator.train_on_batch(gen_imgs, fake)\n            d_loss = 0.5 * (d_loss_real + d_loss_fake)\n            d_acc = 0.5 * (d_acc_real + d_acc_fake)\n\n\n            # ---------------------\n            #  Train Generator\n            # ---------------------\n\n            # We only want to train the generator for the combined model\n            self.discriminator.set_trainable(False)\n\n            # Sample noise and use as generator input\n            noise = np.random.normal(0, 1, (batch_size, self.latent_dim))\n\n            # The generator wants the discriminator to label the generated samples as valid\n            valid = np.concatenate((np.ones((batch_size, 1)), np.zeros((batch_size, 1))), axis=1)\n\n            # Train the generator\n            g_loss, g_acc = self.combined.train_on_batch(noise, valid)\n\n            # Display the progress\n            print (\"%d [D loss: %f, acc: %.2f%%] [G loss: %f, acc: %.2f%%]\" % (epoch, d_loss, 100*d_acc, g_loss, 100*g_acc))\n\n            # If at save interval => save generated image samples\n            if epoch % save_interval == 0:\n                self.save_imgs(epoch)\n\n    def save_imgs(self, epoch):\n        r, c = 5, 5\n        noise = np.random.normal(0, 1, (r * c, 100))\n        gen_imgs = self.generator.predict(noise)\n\n 
       # Rescale images 0 - 1 (from -1 to 1)\n        gen_imgs = 0.5 * (gen_imgs + 1)\n\n        fig, axs = plt.subplots(r, c)\n        plt.suptitle(\"Deep Convolutional Generative Adversarial Network\")\n        cnt = 0\n        for i in range(r):\n            for j in range(c):\n                axs[i,j].imshow(gen_imgs[cnt,0,:,:], cmap='gray')\n                axs[i,j].axis('off')\n                cnt += 1\n        fig.savefig(\"mnist_%d.png\" % epoch)\n        plt.close()\n\n\nif __name__ == '__main__':\n    dcgan = DCGAN()\n    dcgan.train(epochs=200000, batch_size=64, save_interval=50)\n\n\n"
  },
  {
    "path": "mlfromscratch/unsupervised_learning/fp_growth.py",
    "content": "from __future__ import division, print_function\nimport numpy as np\nimport itertools\n\n\nclass FPTreeNode():\n    def __init__(self, item=None, support=1):\n        # 'Value' of the item\n        self.item = item\n        # Number of times the item occurs in a\n        # transaction\n        self.support = support\n        # Child nodes in the FP Growth Tree\n        self.children = {}\n\n\nclass FPGrowth():\n    \"\"\"A method for determining frequent itemsets in a transactional database. \n    This is done by building a so-called FP Growth tree, which can then be mined\n    to collect the frequent itemsets. More effective than Apriori for large transactional\n    databases.\n\n    Parameters:\n    -----------\n    min_sup: float\n        The minimum fraction of transactions an itemset needs to\n        occur in to be deemed frequent\n    \"\"\"\n    def __init__(self, min_sup=0.3):\n        self.min_sup = min_sup\n        # The root of the initial FP Growth Tree\n        self.tree_root = None\n        # Prefixes of itemsets in the FP Growth Tree\n        self.prefixes = {}\n        self.frequent_itemsets = []\n\n    # Count the number of transactions that contain item.\n    def _calculate_support(self, item, transactions):\n        count = 0\n        for transaction in transactions:\n            if item in transaction:\n                count += 1\n        support = count\n        return support\n\n\n    def _get_frequent_items(self, transactions):\n        \"\"\" Returns a set of frequent items. An item is determined to\n        be frequent if there are at least min_sup transactions that contain\n        it. 
\"\"\"\n        # Get all unique items in the transactions\n        unique_items = set(\n            item for transaction in transactions for item in transaction)\n        items = []\n        for item in unique_items:\n            sup = self._calculate_support(item, transactions)\n            if sup >= self.min_sup:\n                items.append([item, sup])\n        # Sort by support - Highest to lowest\n        items.sort(key=lambda item: item[1], reverse=True)\n        frequent_items = [[el[0]] for el in items]\n        # Only return the items\n        return frequent_items\n\n    def _insert_tree(self, node, children):\n        \"\"\" Recursive method which adds nodes to the tree. \"\"\"\n        if not children:\n            return\n        # Create new node as the first item in children list\n        child_item = children[0]\n        child = FPTreeNode(item=child_item)\n        # If parent already contains item => increase the support\n        if child_item in node.children:\n            node.children[child.item].support += 1\n        else:\n            node.children[child.item] = child\n\n        # Execute _insert_tree on the rest of the children list\n        # from the new node\n        self._insert_tree(node.children[child.item], children[1:])\n\n    def _construct_tree(self, transactions, frequent_items=None):\n        if not frequent_items:\n            # Get frequent items sorted by support\n            frequent_items = self._get_frequent_items(transactions)\n        unique_frequent_items = list(\n            set(item for itemset in frequent_items for item in itemset))\n        # Construct the root of the FP Growth tree\n        root = FPTreeNode()\n        for transaction in transactions:\n            # Remove items that are not frequent according to\n            # unique_frequent_items\n            transaction = [item for item in transaction if item in unique_frequent_items]\n            transaction.sort(key=lambda item: 
frequent_items.index([item]))\n            self._insert_tree(root, transaction)\n\n        return root\n\n    def print_tree(self, node=None, indent_times=0):\n        \"\"\" Recursive method which prints the FP Growth Tree \"\"\"\n        if not node:\n            node = self.tree_root\n        indent = \"    \" * indent_times\n        print (\"%s%s:%s\" % (indent, node.item, node.support))\n        for child_key in node.children:\n            child = node.children[child_key]\n            self.print_tree(child, indent_times + 1)\n\n\n    def _is_prefix(self, itemset, node):\n        \"\"\" Makes sure that the first item in itemset is a child of node \n        and that every following item in itemset is reachable via that path \"\"\"\n        for item in itemset:\n            if not item in node.children:\n                return False\n            node = node.children[item]\n        return True\n\n\n    def _determine_prefixes(self, itemset, node, prefixes=None):\n        \"\"\" Recursive method that adds prefixes to the itemset by traversing the \n        FP Growth Tree\"\"\"\n        if not prefixes:\n            prefixes = []\n\n        # If the current node is a prefix to the itemset\n        # add the current prefixes value as prefix to the itemset\n        if self._is_prefix(itemset, node):\n            itemset_key = self._get_itemset_key(itemset)\n            if not itemset_key in self.prefixes:\n                self.prefixes[itemset_key] = []\n            self.prefixes[itemset_key] += [{\"prefix\": prefixes, \"support\": node.children[itemset[0]].support}]\n\n        for child_key in node.children:\n            child = node.children[child_key]\n            # Recursive call with child as new node. 
Add the child item as potential\n            # prefix.\n            self._determine_prefixes(itemset, child, prefixes + [child.item])\n\n\n    def _get_itemset_key(self, itemset):\n        \"\"\" Determines the look of the hashmap key for self.prefixes\n        List of more strings than one gets joined by '-' \"\"\"\n        if len(itemset) > 1:\n            itemset_key = \"-\".join(itemset)\n        else:\n            itemset_key = str(itemset[0])\n        return itemset_key\n\n    def _determine_frequent_itemsets(self, conditional_database, suffix):\n        # Calculate new frequent items from the conditional database\n        # of suffix\n        frequent_items = self._get_frequent_items(conditional_database)\n\n        cond_tree = None\n\n        if suffix:\n            cond_tree = self._construct_tree(conditional_database, frequent_items)\n            # Output new frequent itemset as the suffix added to the frequent\n            # items\n            self.frequent_itemsets += [el + suffix for el in frequent_items]\n\n        # Find larger frequent itemset by finding prefixes\n        # of the frequent items in the FP Growth Tree for the conditional\n        # database.\n        self.prefixes = {}\n        for itemset in frequent_items:\n            # If no suffix (first run)\n            if not cond_tree:\n                cond_tree = self.tree_root\n            # Determine prefixes to itemset\n            self._determine_prefixes(itemset, cond_tree)\n            conditional_database = []\n            itemset_key = self._get_itemset_key(itemset)\n            # Build new conditional database\n            if itemset_key in self.prefixes:\n                for el in self.prefixes[itemset_key]:\n                    # If support = 4 => add 4 of the corresponding prefix set\n                    for _ in range(el[\"support\"]):\n                        conditional_database.append(el[\"prefix\"])\n                # Create new suffix\n                new_suffix = itemset 
+ suffix if suffix else itemset\n                self._determine_frequent_itemsets(conditional_database, suffix=new_suffix)\n\n    def find_frequent_itemsets(self, transactions, suffix=None, show_tree=False):\n        self.transactions = transactions\n\n        # Build the FP Growth Tree\n        self.tree_root = self._construct_tree(transactions)\n        if show_tree:\n            print (\"FP-Growth Tree:\")\n            self.print_tree(self.tree_root)\n\n        self._determine_frequent_itemsets(transactions, suffix=None)\n\n        return self.frequent_itemsets\n\n"
  },
  {
    "path": "mlfromscratch/unsupervised_learning/gaussian_mixture_model.py",
    "content": "from __future__ import division, print_function\nimport math\nfrom sklearn import datasets\nimport numpy as np\n\nfrom mlfromscratch.utils import normalize, euclidean_distance, calculate_covariance_matrix\nfrom mlfromscratch.utils import Plot\n\n\nclass GaussianMixtureModel():\n    \"\"\"A probabilistic clustering method for determining groupings among data samples.\n\n    Parameters:\n    -----------\n    k: int\n        The number of clusters the algorithm will form.\n    max_iterations: int\n        The number of iterations the algorithm will run for if it does\n        not converge before that. \n    tolerance: float\n        If the difference of the results from one iteration to the next is\n        smaller than this value we will say that the algorithm has converged.\n    \"\"\"\n    def __init__(self, k=2, max_iterations=2000, tolerance=1e-8):\n        self.k = k\n        self.parameters = []\n        self.max_iterations = max_iterations\n        self.tolerance = tolerance\n        self.responsibilities = []\n        self.sample_assignments = None\n        self.responsibility = None\n\n    def _init_random_gaussians(self, X):\n        \"\"\" Initialize gaussian randomly \"\"\"\n        n_samples = np.shape(X)[0]\n        self.priors = (1 / self.k) * np.ones(self.k)\n        for i in range(self.k):\n            params = {}\n            params[\"mean\"] = X[np.random.choice(range(n_samples))]\n            params[\"cov\"] = calculate_covariance_matrix(X)\n            self.parameters.append(params)\n\n    def multivariate_gaussian(self, X, params):\n        \"\"\" Likelihood \"\"\"\n        n_features = np.shape(X)[1]\n        mean = params[\"mean\"]\n        covar = params[\"cov\"]\n        determinant = np.linalg.det(covar)\n        likelihoods = np.zeros(np.shape(X)[0])\n        for i, sample in enumerate(X):\n            d = n_features  # dimension\n            coeff = (1.0 / (math.pow((2.0 * math.pi), d / 2)\n                            * 
math.sqrt(determinant)))\n            exponent = math.exp(-0.5 * (sample - mean).T.dot(np.linalg.pinv(covar)).dot((sample - mean)))\n            likelihoods[i] = coeff * exponent\n\n        return likelihoods\n\n    def _get_likelihoods(self, X):\n        \"\"\" Calculate the likelihood over all samples \"\"\"\n        n_samples = np.shape(X)[0]\n        likelihoods = np.zeros((n_samples, self.k))\n        for i in range(self.k):\n            likelihoods[:, i] = self.multivariate_gaussian(\n                X, self.parameters[i])\n        return likelihoods\n\n    def _expectation(self, X):\n        \"\"\" Calculate the responsibility \"\"\"\n        # Calculate probabilities of X belonging to the different clusters\n        weighted_likelihoods = self._get_likelihoods(X) * self.priors\n        sum_likelihoods = np.expand_dims(\n            np.sum(weighted_likelihoods, axis=1), axis=1)\n        # Determine responsibility as P(X|y)*P(y)/P(X)\n        self.responsibility = weighted_likelihoods / sum_likelihoods\n        # Assign samples to cluster that has largest probability\n        self.sample_assignments = self.responsibility.argmax(axis=1)\n        # Save value for convergence check\n        self.responsibilities.append(np.max(self.responsibility, axis=1))\n\n    def _maximization(self, X):\n        \"\"\" Update the parameters and priors \"\"\"\n        # Iterate through clusters and recalculate mean and covariance\n        for i in range(self.k):\n            resp = np.expand_dims(self.responsibility[:, i], axis=1)\n            mean = (resp * X).sum(axis=0) / resp.sum()\n            covariance = (X - mean).T.dot((X - mean) * resp) / resp.sum()\n            self.parameters[i][\"mean\"] = mean\n            self.parameters[i][\"cov\"] = covariance\n\n        # Update weights\n        n_samples = np.shape(X)[0]\n        self.priors = self.responsibility.sum(axis=0) / n_samples\n\n    def _converged(self, X):\n        \"\"\" Convergence if || 
likelihood - last_likelihood || < tolerance \"\"\"\n        if len(self.responsibilities) < 2:\n            return False\n        diff = np.linalg.norm(\n            self.responsibilities[-1] - self.responsibilities[-2])\n        # print (\"Likelihood update: %s (tol: %s)\" % (diff, self.tolerance))\n        return diff <= self.tolerance\n\n    def predict(self, X):\n        \"\"\" Run GMM and return the cluster indices \"\"\"\n        # Initialize the gaussians randomly\n        self._init_random_gaussians(X)\n\n        # Run EM until convergence or for max iterations\n        for _ in range(self.max_iterations):\n            self._expectation(X)    # E-step\n            self._maximization(X)   # M-step\n\n            # Check convergence\n            if self._converged(X):\n                break\n\n        # Make new assignments and return them\n        self._expectation(X)\n        return self.sample_assignments\n"
  },
  {
    "path": "mlfromscratch/unsupervised_learning/generative_adversarial_network.py",
    "content": "from __future__ import print_function, division\nfrom sklearn import datasets\nimport math\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport progressbar\n\nfrom sklearn.datasets import fetch_mldata\n\nfrom mlfromscratch.deep_learning.optimizers import Adam\nfrom mlfromscratch.deep_learning.loss_functions import CrossEntropy\nfrom mlfromscratch.deep_learning.layers import Dense, Dropout, Flatten, Activation, Reshape, BatchNormalization\nfrom mlfromscratch.deep_learning import NeuralNetwork\n\n\nclass GAN():\n    \"\"\"A Generative Adversarial Network with deep fully-connected neural nets as\n    Generator and Discriminator.\n\n    Training Data: MNIST Handwritten Digits (28x28 images)\n    \"\"\"\n    def __init__(self):\n        self.img_rows = 28 \n        self.img_cols = 28\n        self.img_dim = self.img_rows * self.img_cols\n        self.latent_dim = 100\n\n        optimizer = Adam(learning_rate=0.0002, b1=0.5)\n        loss_function = CrossEntropy\n\n        # Build the discriminator\n        self.discriminator = self.build_discriminator(optimizer, loss_function)\n\n        # Build the generator\n        self.generator = self.build_generator(optimizer, loss_function)\n\n        # Build the combined model\n        self.combined = NeuralNetwork(optimizer=optimizer, loss=loss_function)\n        self.combined.layers.extend(self.generator.layers)\n        self.combined.layers.extend(self.discriminator.layers)\n\n        print ()\n        self.generator.summary(name=\"Generator\")\n        self.discriminator.summary(name=\"Discriminator\")\n\n    def build_generator(self, optimizer, loss_function):\n        \n        model = NeuralNetwork(optimizer=optimizer, loss=loss_function)\n\n        model.add(Dense(256, input_shape=(self.latent_dim,)))\n        model.add(Activation('leaky_relu'))\n        model.add(BatchNormalization(momentum=0.8))\n        model.add(Dense(512))\n        model.add(Activation('leaky_relu'))\n        
model.add(BatchNormalization(momentum=0.8))\n        model.add(Dense(1024))\n        model.add(Activation('leaky_relu'))\n        model.add(BatchNormalization(momentum=0.8))\n        model.add(Dense(self.img_dim))\n        model.add(Activation('tanh'))\n\n        return model\n\n    def build_discriminator(self, optimizer, loss_function):\n        \n        model = NeuralNetwork(optimizer=optimizer, loss=loss_function)\n\n        model.add(Dense(512, input_shape=(self.img_dim,)))\n        model.add(Activation('leaky_relu'))\n        model.add(Dropout(0.5))\n        model.add(Dense(256))\n        model.add(Activation('leaky_relu'))\n        model.add(Dropout(0.5))\n        model.add(Dense(2))\n        model.add(Activation('softmax'))\n\n        return model\n\n    def train(self, n_epochs, batch_size=128, save_interval=50):\n\n        mnist = fetch_mldata('MNIST original')\n\n        X = mnist.data\n        y = mnist.target\n\n        # Rescale [-1, 1]\n        X = (X.astype(np.float32) - 127.5) / 127.5\n\n        half_batch = int(batch_size / 2)\n\n        for epoch in range(n_epochs):\n\n            # ---------------------\n            #  Train Discriminator\n            # ---------------------\n\n            self.discriminator.set_trainable(True)\n\n            # Select a random half batch of images\n            idx = np.random.randint(0, X.shape[0], half_batch)\n            imgs = X[idx]\n\n            # Sample noise to use as generator input\n            noise = np.random.normal(0, 1, (half_batch, self.latent_dim))\n\n            # Generate a half batch of images\n            gen_imgs = self.generator.predict(noise)\n\n            # Valid = [1, 0], Fake = [0, 1]\n            valid = np.concatenate((np.ones((half_batch, 1)), np.zeros((half_batch, 1))), axis=1)\n            fake = np.concatenate((np.zeros((half_batch, 1)), np.ones((half_batch, 1))), axis=1)\n\n            # Train the discriminator\n            d_loss_real, d_acc_real = 
self.discriminator.train_on_batch(imgs, valid)\n            d_loss_fake, d_acc_fake = self.discriminator.train_on_batch(gen_imgs, fake)\n            d_loss = 0.5 * (d_loss_real + d_loss_fake)\n            d_acc = 0.5 * (d_acc_real + d_acc_fake)\n\n\n            # ---------------------\n            #  Train Generator\n            # ---------------------\n\n            # We only want to train the generator for the combined model\n            self.discriminator.set_trainable(False)\n\n            # Sample noise and use as generator input\n            noise = np.random.normal(0, 1, (batch_size, self.latent_dim))\n\n            # The generator wants the discriminator to label the generated samples as valid\n            valid = np.concatenate((np.ones((batch_size, 1)), np.zeros((batch_size, 1))), axis=1)\n\n            # Train the generator\n            g_loss, g_acc = self.combined.train_on_batch(noise, valid)\n\n            # Display the progress\n            print (\"%d [D loss: %f, acc: %.2f%%] [G loss: %f, acc: %.2f%%]\" % (epoch, d_loss, 100*d_acc, g_loss, 100*g_acc))\n\n            # If at save interval => save generated image samples\n            if epoch % save_interval == 0:\n                self.save_imgs(epoch)\n\n    def save_imgs(self, epoch):\n        r, c = 5, 5 # Grid size\n        noise = np.random.normal(0, 1, (r * c, self.latent_dim))\n        # Generate images and reshape to image shape\n        gen_imgs = self.generator.predict(noise).reshape((-1, self.img_rows, self.img_cols))\n\n        # Rescale images 0 - 1\n        gen_imgs = 0.5 * gen_imgs + 0.5\n\n        fig, axs = plt.subplots(r, c)\n        plt.suptitle(\"Generative Adversarial Network\")\n        cnt = 0\n        for i in range(r):\n            for j in range(c):\n                axs[i,j].imshow(gen_imgs[cnt,:,:], cmap='gray')\n                axs[i,j].axis('off')\n                cnt += 1\n        fig.savefig(\"mnist_%d.png\" % epoch)\n        plt.close()\n\n\nif __name__ == 
'__main__':\n    gan = GAN()\n    gan.train(n_epochs=200000, batch_size=64, save_interval=400)\n\n\n"
  },
  {
    "path": "mlfromscratch/unsupervised_learning/genetic_algorithm.py",
    "content": "from __future__ import print_function, division\nimport string\nimport numpy as np\n\nclass GeneticAlgorithm():\n    \"\"\"An implementation of a Genetic Algorithm which will try to produce the user\n    specified target string.\n\n    Parameters:\n    -----------\n    target_string: string\n        The string which the GA should try to produce.\n    population_size: int\n        The number of individuals (possible solutions) in the population.\n    mutation_rate: float\n        The rate (or probability) of which the alleles (chars in this case) should be\n        randomly changed.\n    \"\"\"\n    def __init__(self, target_string, population_size, mutation_rate):\n        self.target = target_string\n        self.population_size = population_size\n        self.mutation_rate = mutation_rate\n        self.letters = [\" \"] + list(string.ascii_letters)\n\n    def _initialize(self):\n        \"\"\" Initialize population with random strings \"\"\"\n        self.population = []\n        for _ in range(self.population_size):\n            # Select random letters as new individual\n            individual = \"\".join(np.random.choice(self.letters, size=len(self.target)))\n            self.population.append(individual)\n\n    def _calculate_fitness(self):\n        \"\"\" Calculates the fitness of each individual in the population \"\"\"\n        population_fitness = []\n        for individual in self.population:\n            # Calculate loss as the alphabetical distance between\n            # the characters in the individual and the target string\n            loss = 0\n            for i in range(len(individual)):\n                letter_i1 = self.letters.index(individual[i])\n                letter_i2 = self.letters.index(self.target[i])\n                loss += abs(letter_i1 - letter_i2)\n            fitness = 1 / (loss + 1e-6)\n            population_fitness.append(fitness)\n        return population_fitness\n\n    def _mutate(self, individual):\n        
\"\"\" Randomly change the individual's characters with probability\n        self.mutation_rate \"\"\"\n        individual = list(individual)\n        for j in range(len(individual)):\n            # Make change with probability mutation_rate\n            if np.random.random() < self.mutation_rate:\n                individual[j] = np.random.choice(self.letters)\n        # Return mutated individual as string\n        return \"\".join(individual)\n\n    def _crossover(self, parent1, parent2):\n        \"\"\" Create children from parents by crossover \"\"\"\n        # Select random crossover point\n        cross_i = np.random.randint(0, len(parent1))\n        child1 = parent1[:cross_i] + parent2[cross_i:]\n        child2 = parent2[:cross_i] + parent1[cross_i:]\n        return child1, child2\n\n    def run(self, iterations):\n        # Initialize new population\n        self._initialize()\n\n        for epoch in range(iterations):\n            population_fitness = self._calculate_fitness()\n\n            fittest_individual = self.population[np.argmax(population_fitness)]\n            highest_fitness = max(population_fitness)\n\n            # If we have found individual which matches the target => Done\n            if fittest_individual == self.target:\n                break\n\n            # Set the probability that the individual should be selected as a parent\n            # proportionate to the individual's fitness.\n            parent_probabilities = [fitness / sum(population_fitness) for fitness in population_fitness]\n\n            # Determine the next generation\n            new_population = []\n            for i in np.arange(0, self.population_size, 2):\n                # Select two parents randomly according to probabilities\n                parent1, parent2 = np.random.choice(self.population, size=2, p=parent_probabilities, replace=False)\n                # Perform crossover to produce offspring\n                child1, child2 = self._crossover(parent1, 
parent2)\n                # Save mutated offspring for next generation\n                new_population += [self._mutate(child1), self._mutate(child2)]\n\n            print (\"[%d Closest Candidate: '%s', Fitness: %.2f]\" % (epoch, fittest_individual, highest_fitness))\n            self.population = new_population\n\n        print (\"[%d Answer: '%s']\" % (epoch, fittest_individual))\n\n\n\n\n\n\n"
  },
  {
    "path": "mlfromscratch/unsupervised_learning/k_means.py",
    "content": "from __future__ import print_function, division\nimport numpy as np\nfrom mlfromscratch.utils import normalize, euclidean_distance, Plot\nfrom mlfromscratch.unsupervised_learning import *\n\nclass KMeans():\n    \"\"\"A simple clustering method that forms k clusters by iteratively reassigning\n    samples to the closest centroids and after that moves the centroids to the center\n    of the new formed clusters.\n\n\n    Parameters:\n    -----------\n    k: int\n        The number of clusters the algorithm will form.\n    max_iterations: int\n        The number of iterations the algorithm will run for if it does\n        not converge before that. \n    \"\"\"\n    def __init__(self, k=2, max_iterations=500):\n        self.k = k\n        self.max_iterations = max_iterations\n\n    def _init_random_centroids(self, X):\n        \"\"\" Initialize the centroids as k random samples of X\"\"\"\n        n_samples, n_features = np.shape(X)\n        centroids = np.zeros((self.k, n_features))\n        for i in range(self.k):\n            centroid = X[np.random.choice(range(n_samples))]\n            centroids[i] = centroid\n        return centroids\n\n    def _closest_centroid(self, sample, centroids):\n        \"\"\" Return the index of the closest centroid to the sample \"\"\"\n        closest_i = 0\n        closest_dist = float('inf')\n        for i, centroid in enumerate(centroids):\n            distance = euclidean_distance(sample, centroid)\n            if distance < closest_dist:\n                closest_i = i\n                closest_dist = distance\n        return closest_i\n\n    def _create_clusters(self, centroids, X):\n        \"\"\" Assign the samples to the closest centroids to create clusters \"\"\"\n        n_samples = np.shape(X)[0]\n        clusters = [[] for _ in range(self.k)]\n        for sample_i, sample in enumerate(X):\n            centroid_i = self._closest_centroid(sample, centroids)\n            clusters[centroid_i].append(sample_i)\n  
      return clusters\n\n    def _calculate_centroids(self, clusters, X):\n        \"\"\" Calculate new centroids as the means of the samples in each cluster  \"\"\"\n        n_features = np.shape(X)[1]\n        centroids = np.zeros((self.k, n_features))\n        for i, cluster in enumerate(clusters):\n            centroid = np.mean(X[cluster], axis=0)\n            centroids[i] = centroid\n        return centroids\n\n    def _get_cluster_labels(self, clusters, X):\n        \"\"\" Classify samples as the index of their clusters \"\"\"\n        # One prediction for each sample\n        y_pred = np.zeros(np.shape(X)[0])\n        for cluster_i, cluster in enumerate(clusters):\n            for sample_i in cluster:\n                y_pred[sample_i] = cluster_i\n        return y_pred\n\n    def predict(self, X):\n        \"\"\" Do K-Means clustering and return cluster indices \"\"\"\n\n        # Initialize centroids as k random samples from X\n        centroids = self._init_random_centroids(X)\n\n        # Iterate until convergence or for max iterations\n        for _ in range(self.max_iterations):\n            # Assign samples to closest centroids (create clusters)\n            clusters = self._create_clusters(centroids, X)\n            # Save current centroids for convergence check\n            prev_centroids = centroids\n            # Calculate new centroids from the clusters\n            centroids = self._calculate_centroids(clusters, X)\n            # If no centroids have changed => convergence\n            diff = centroids - prev_centroids\n            if not diff.any():\n                break\n\n        return self._get_cluster_labels(clusters, X)\n\n"
  },
  {
    "path": "mlfromscratch/unsupervised_learning/partitioning_around_medoids.py",
    "content": "from __future__ import print_function, division\nimport numpy as np\nfrom mlfromscratch.utils import normalize, euclidean_distance, Plot\nfrom mlfromscratch.unsupervised_learning import PCA\n\n\nclass PAM():\n    \"\"\"A simple clustering method that forms k clusters by first assigning\n    samples to the closest medoids, and then swapping medoids with non-medoid\n    samples if the total distance (cost) between the cluster members and their medoid\n    is smaller than prevoisly.\n\n\n    Parameters:\n    -----------\n    k: int\n        The number of clusters the algorithm will form.\n    \"\"\"\n    def __init__(self, k=2):\n        self.k = k\n\n    def _init_random_medoids(self, X):\n        \"\"\" Initialize the medoids as random samples \"\"\"\n        n_samples, n_features = np.shape(X)\n        medoids = np.zeros((self.k, n_features))\n        for i in range(self.k):\n            medoid = X[np.random.choice(range(n_samples))]\n            medoids[i] = medoid\n        return medoids\n\n    def _closest_medoid(self, sample, medoids):\n        \"\"\" Return the index of the closest medoid to the sample \"\"\"\n        closest_i = None\n        closest_distance = float(\"inf\")\n        for i, medoid in enumerate(medoids):\n            distance = euclidean_distance(sample, medoid)\n            if distance < closest_distance:\n                closest_i = i\n                closest_distance = distance\n        return closest_i\n\n    def _create_clusters(self, X, medoids):\n        \"\"\" Assign the samples to the closest medoids to create clusters \"\"\"\n        clusters = [[] for _ in range(self.k)]\n        for sample_i, sample in enumerate(X):\n            medoid_i = self._closest_medoid(sample, medoids)\n            clusters[medoid_i].append(sample_i)\n        return clusters\n\n    def _calculate_cost(self, X, clusters, medoids):\n        \"\"\" Calculate the cost (total distance between samples and their medoids) \"\"\"\n        cost = 0\n 
       # For each cluster\n        for i, cluster in enumerate(clusters):\n            medoid = medoids[i]\n            for sample_i in cluster:\n                # Add distance between sample and medoid as cost\n                cost += euclidean_distance(X[sample_i], medoid)\n        return cost\n\n    def _get_non_medoids(self, X, medoids):\n        \"\"\" Returns a list of all samples that are not currently medoids \"\"\"\n        non_medoids = []\n        for sample in X:\n            if not sample in medoids:\n                non_medoids.append(sample)\n        return non_medoids\n\n    def _get_cluster_labels(self, clusters, X):\n        \"\"\" Classify samples as the index of their clusters \"\"\"\n        # One prediction for each sample\n        y_pred = np.zeros(np.shape(X)[0])\n        for cluster_i in range(len(clusters)):\n            cluster = clusters[cluster_i]\n            for sample_i in cluster:\n                y_pred[sample_i] = cluster_i\n        return y_pred\n\n    def predict(self, X):\n        \"\"\" Do Partitioning Around Medoids and return the cluster labels \"\"\"\n        # Initialize medoids randomly\n        medoids = self._init_random_medoids(X)\n        # Assign samples to closest medoids\n        clusters = self._create_clusters(X, medoids)\n\n        # Calculate the initial cost (total distance between samples and\n        # corresponding medoids)\n        cost = self._calculate_cost(X, clusters, medoids)\n\n        # Iterate until we no longer have a cheaper cost\n        while True:\n            best_medoids = medoids\n            lowest_cost = cost\n            for medoid in medoids:\n                # Get all non-medoid samples\n                non_medoids = self._get_non_medoids(X, medoids)\n                # Calculate the cost when swapping medoid and samples\n                for sample in non_medoids:\n                    # Swap sample with the medoid\n                    new_medoids = medoids.copy()\n                    
new_medoids[(medoids == medoid).all(axis=1)] = sample\n                    # Assign samples to new medoids\n                    new_clusters = self._create_clusters(X, new_medoids)\n                    # Calculate the cost with the new set of medoids\n                    new_cost = self._calculate_cost(\n                        X, new_clusters, new_medoids)\n                    # If the swap gives us a lower cost we save the medoids and cost\n                    if new_cost < lowest_cost:\n                        lowest_cost = new_cost\n                        best_medoids = new_medoids\n            # If there was a swap that resulted in a lower cost we save the\n            # resulting medoids from the best swap and the new cost\n            if lowest_cost < cost:\n                cost = lowest_cost\n                medoids = best_medoids\n            # Else finished\n            else:\n                break\n\n        final_clusters = self._create_clusters(X, medoids)\n        # Return the samples' cluster indices as labels\n        return self._get_cluster_labels(final_clusters, X)\n\n"
  },
  {
    "path": "mlfromscratch/unsupervised_learning/principal_component_analysis.py",
    "content": "from __future__ import print_function, division\nimport numpy as np\nfrom mlfromscratch.utils import calculate_covariance_matrix\n\n\nclass PCA():\n    \"\"\"A method for doing dimensionality reduction by transforming the feature\n    space to a lower dimensionality, removing correlation between features and\n    maximizing the variance along each feature axis. This class is also used throughout\n    the project to plot data.\n    \"\"\"\n    def transform(self, X, n_components):\n        \"\"\" Fit the dataset to the number of principal components specified in the\n        constructor and return the transformed dataset \"\"\"\n        covariance_matrix = calculate_covariance_matrix(X)\n\n        # Where (eigenvector[:,0] corresponds to eigenvalue[0])\n        eigenvalues, eigenvectors = np.linalg.eig(covariance_matrix)\n\n        # Sort the eigenvalues and corresponding eigenvectors from largest\n        # to smallest eigenvalue and select the first n_components\n        idx = eigenvalues.argsort()[::-1]\n        eigenvalues = eigenvalues[idx][:n_components]\n        eigenvectors = np.atleast_1d(eigenvectors[:, idx])[:, :n_components]\n\n        # Project the data onto principal components\n        X_transformed = X.dot(eigenvectors)\n\n        return X_transformed\n"
  },
  {
    "path": "mlfromscratch/unsupervised_learning/restricted_boltzmann_machine.py",
    "content": "import logging\nimport numpy as np\nimport progressbar\n\nfrom mlfromscratch.utils.misc import bar_widgets\nfrom mlfromscratch.utils import batch_iterator\nfrom mlfromscratch.deep_learning.activation_functions import Sigmoid\n\nsigmoid = Sigmoid()\n\nclass RBM():\n    \"\"\"Bernoulli Restricted Boltzmann Machine (RBM)\n\n    Parameters:\n    -----------\n    n_hidden: int:\n        The number of processing nodes (neurons) in the hidden layer. \n    learning_rate: float\n        The step length that will be used when updating the weights.\n    batch_size: int\n        The size of the mini-batch used to calculate each weight update.\n    n_iterations: float\n        The number of training iterations the algorithm will tune the weights for.\n\n    Reference:\n        A Practical Guide to Training Restricted Boltzmann Machines \n        URL: https://www.cs.toronto.edu/~hinton/absps/guideTR.pdf\n    \"\"\"\n    def __init__(self, n_hidden=128, learning_rate=0.1, batch_size=10, n_iterations=100):\n        self.n_iterations = n_iterations\n        self.batch_size = batch_size\n        self.lr = learning_rate\n        self.n_hidden = n_hidden\n        self.progressbar = progressbar.ProgressBar(widgets=bar_widgets)\n\n    def _initialize_weights(self, X):\n        n_visible = X.shape[1]\n        self.W = np.random.normal(scale=0.1, size=(n_visible, self.n_hidden))\n        self.v0 = np.zeros(n_visible)       # Bias visible\n        self.h0 = np.zeros(self.n_hidden)   # Bias hidden\n\n    def fit(self, X, y=None):\n        '''Contrastive Divergence training procedure'''\n\n        self._initialize_weights(X)\n\n        self.training_errors = []\n        self.training_reconstructions = []\n        for _ in self.progressbar(range(self.n_iterations)):\n            batch_errors = []\n            for batch in batch_iterator(X, batch_size=self.batch_size):\n                # Positive phase\n                positive_hidden = sigmoid(batch.dot(self.W) + self.h0)\n    
            hidden_states = self._sample(positive_hidden)\n                positive_associations = batch.T.dot(positive_hidden)\n\n                # Negative phase\n                negative_visible = sigmoid(hidden_states.dot(self.W.T) + self.v0)\n                negative_visible = self._sample(negative_visible)\n                negative_hidden = sigmoid(negative_visible.dot(self.W) + self.h0)\n                negative_associations = negative_visible.T.dot(negative_hidden)\n\n                self.W  += self.lr * (positive_associations - negative_associations)\n                self.h0 += self.lr * (positive_hidden.sum(axis=0) - negative_hidden.sum(axis=0))\n                self.v0 += self.lr * (batch.sum(axis=0) - negative_visible.sum(axis=0))\n\n                batch_errors.append(np.mean((batch - negative_visible) ** 2))\n\n            self.training_errors.append(np.mean(batch_errors))\n            # Reconstruct a batch of images from the training set\n            idx = np.random.choice(range(X.shape[0]), self.batch_size)\n            self.training_reconstructions.append(self.reconstruct(X[idx]))\n\n    def _sample(self, X):\n        return X > np.random.random_sample(size=X.shape)\n\n    def reconstruct(self, X):\n        positive_hidden = sigmoid(X.dot(self.W) + self.h0)\n        hidden_states = self._sample(positive_hidden)\n        negative_visible = sigmoid(hidden_states.dot(self.W.T) + self.v0)\n        return negative_visible\n\n"
  },
  {
    "path": "mlfromscratch/utils/__init__.py",
    "content": "from .misc import Plot\nfrom .data_manipulation import *\nfrom .data_operation import *"
  },
  {
    "path": "mlfromscratch/utils/data_manipulation.py",
    "content": "from __future__ import division\nfrom itertools import combinations_with_replacement\nimport numpy as np\nimport math\nimport sys\n\n\ndef shuffle_data(X, y, seed=None):\n    \"\"\" Random shuffle of the samples in X and y \"\"\"\n    if seed:\n        np.random.seed(seed)\n    idx = np.arange(X.shape[0])\n    np.random.shuffle(idx)\n    return X[idx], y[idx]\n\n\ndef batch_iterator(X, y=None, batch_size=64):\n    \"\"\" Simple batch generator \"\"\"\n    n_samples = X.shape[0]\n    for i in np.arange(0, n_samples, batch_size):\n        begin, end = i, min(i+batch_size, n_samples)\n        if y is not None:\n            yield X[begin:end], y[begin:end]\n        else:\n            yield X[begin:end]\n\n\ndef divide_on_feature(X, feature_i, threshold):\n    \"\"\" Divide dataset based on if sample value on feature index is larger than\n        the given threshold \"\"\"\n    split_func = None\n    if isinstance(threshold, int) or isinstance(threshold, float):\n        split_func = lambda sample: sample[feature_i] >= threshold\n    else:\n        split_func = lambda sample: sample[feature_i] == threshold\n\n    X_1 = np.array([sample for sample in X if split_func(sample)])\n    X_2 = np.array([sample for sample in X if not split_func(sample)])\n\n    return np.array([X_1, X_2])\n\n\ndef polynomial_features(X, degree):\n    n_samples, n_features = np.shape(X)\n\n    def index_combinations():\n        combs = [combinations_with_replacement(range(n_features), i) for i in range(0, degree + 1)]\n        flat_combs = [item for sublist in combs for item in sublist]\n        return flat_combs\n    \n    combinations = index_combinations()\n    n_output_features = len(combinations)\n    X_new = np.empty((n_samples, n_output_features))\n    \n    for i, index_combs in enumerate(combinations):  \n        X_new[:, i] = np.prod(X[:, index_combs], axis=1)\n\n    return X_new\n\n\ndef get_random_subsets(X, y, n_subsets, replacements=True):\n    \"\"\" Return random 
subsets (with replacements) of the data \"\"\"\n    n_samples = np.shape(X)[0]\n    # Concatenate x and y and do a random shuffle\n    X_y = np.concatenate((X, y.reshape((1, len(y))).T), axis=1)\n    np.random.shuffle(X_y)\n    subsets = []\n\n    # Uses 50% of training samples without replacements\n    subsample_size = int(n_samples // 2)\n    if replacements:\n        subsample_size = n_samples      # 100% with replacements\n\n    for _ in range(n_subsets):\n        idx = np.random.choice(\n            range(n_samples),\n            size=np.shape(range(subsample_size)),\n            replace=replacements)\n        X = X_y[idx][:, :-1]\n        y = X_y[idx][:, -1]\n        subsets.append([X, y])\n    return subsets\n\n\ndef normalize(X, axis=-1, order=2):\n    \"\"\" Normalize the dataset X \"\"\"\n    l2 = np.atleast_1d(np.linalg.norm(X, order, axis))\n    l2[l2 == 0] = 1\n    return X / np.expand_dims(l2, axis)\n\n\ndef standardize(X):\n    \"\"\" Standardize the dataset X \"\"\"\n    X_std = X\n    mean = X.mean(axis=0)\n    std = X.std(axis=0)\n    for col in range(np.shape(X)[1]):\n        if std[col]:\n            X_std[:, col] = (X_std[:, col] - mean[col]) / std[col]\n    # X_std = (X - X.mean(axis=0)) / X.std(axis=0)\n    return X_std\n\n\ndef train_test_split(X, y, test_size=0.5, shuffle=True, seed=None):\n    \"\"\" Split the data into train and test sets \"\"\"\n    if shuffle:\n        X, y = shuffle_data(X, y, seed)\n    # Split the training data from test data in the ratio specified in\n    # test_size\n    split_i = len(y) - int(len(y) // (1 / test_size))\n    X_train, X_test = X[:split_i], X[split_i:]\n    y_train, y_test = y[:split_i], y[split_i:]\n\n    return X_train, X_test, y_train, y_test\n\n\ndef k_fold_cross_validation_sets(X, y, k, shuffle=True):\n    \"\"\" Split the data into k sets of training / test data \"\"\"\n    if shuffle:\n        X, y = shuffle_data(X, y)\n\n    n_samples = len(y)\n    left_overs = {}\n    n_left_overs = 
(n_samples % k)\n    if n_left_overs != 0:\n        left_overs[\"X\"] = X[-n_left_overs:]\n        left_overs[\"y\"] = y[-n_left_overs:]\n        X = X[:-n_left_overs]\n        y = y[:-n_left_overs]\n\n    X_split = np.split(X, k)\n    y_split = np.split(y, k)\n    sets = []\n    for i in range(k):\n        X_test, y_test = X_split[i], y_split[i]\n        X_train = np.concatenate(X_split[:i] + X_split[i + 1:], axis=0)\n        y_train = np.concatenate(y_split[:i] + y_split[i + 1:], axis=0)\n        sets.append([X_train, X_test, y_train, y_test])\n\n    # Add left over samples to last set as training samples.\n    # np.append returns a new array, so the result must be assigned back.\n    if n_left_overs != 0:\n        sets[-1][0] = np.append(sets[-1][0], left_overs[\"X\"], axis=0)\n        sets[-1][2] = np.append(sets[-1][2], left_overs[\"y\"], axis=0)\n\n    return np.array(sets)\n\n\ndef to_categorical(x, n_col=None):\n    \"\"\" One-hot encoding of nominal values \"\"\"\n    if not n_col:\n        n_col = np.amax(x) + 1\n    one_hot = np.zeros((x.shape[0], n_col))\n    one_hot[np.arange(x.shape[0]), x] = 1\n    return one_hot\n\n\ndef to_nominal(x):\n    \"\"\" Conversion from one-hot encoding to nominal \"\"\"\n    return np.argmax(x, axis=1)\n\n\ndef make_diagonal(x):\n    \"\"\" Converts a vector into a diagonal matrix \"\"\"\n    m = np.zeros((len(x), len(x)))\n    for i in range(len(m[0])):\n        m[i, i] = x[i]\n    return m\n"
  },
  {
    "path": "mlfromscratch/utils/data_operation.py",
    "content": "from __future__ import division\nimport numpy as np\nimport math\nimport sys\n\n\ndef calculate_entropy(y):\n    \"\"\" Calculate the entropy of label array y \"\"\"\n    log2 = lambda x: math.log(x) / math.log(2)\n    unique_labels = np.unique(y)\n    entropy = 0\n    for label in unique_labels:\n        count = len(y[y == label])\n        p = count / len(y)\n        entropy += -p * log2(p)\n    return entropy\n\n\ndef mean_squared_error(y_true, y_pred):\n    \"\"\" Returns the mean squared error between y_true and y_pred \"\"\"\n    mse = np.mean(np.power(y_true - y_pred, 2))\n    return mse\n\n\ndef calculate_variance(X):\n    \"\"\" Return the variance of the features in dataset X \"\"\"\n    mean = np.ones(np.shape(X)) * X.mean(0)\n    n_samples = np.shape(X)[0]\n    variance = (1 / n_samples) * np.diag((X - mean).T.dot(X - mean))\n    \n    return variance\n\n\ndef calculate_std_dev(X):\n    \"\"\" Calculate the standard deviations of the features in dataset X \"\"\"\n    std_dev = np.sqrt(calculate_variance(X))\n    return std_dev\n\n\ndef euclidean_distance(x1, x2):\n    \"\"\" Calculates the l2 distance between two vectors \"\"\"\n    distance = 0\n    # Squared distance between each coordinate\n    for i in range(len(x1)):\n        distance += pow((x1[i] - x2[i]), 2)\n    return math.sqrt(distance)\n\n\ndef accuracy_score(y_true, y_pred):\n    \"\"\" Compare y_true to y_pred and return the accuracy \"\"\"\n    accuracy = np.sum(y_true == y_pred, axis=0) / len(y_true)\n    return accuracy\n\n\ndef calculate_covariance_matrix(X, Y=None):\n    \"\"\" Calculate the covariance matrix for the dataset X \"\"\"\n    if Y is None:\n        Y = X\n    n_samples = np.shape(X)[0]\n    covariance_matrix = (1 / (n_samples-1)) * (X - X.mean(axis=0)).T.dot(Y - Y.mean(axis=0))\n\n    return np.array(covariance_matrix, dtype=float)\n \n\ndef calculate_correlation_matrix(X, Y=None):\n    \"\"\" Calculate the correlation matrix for the dataset X \"\"\"\n    
if Y is None:\n        Y = X\n    n_samples = np.shape(X)[0]\n    covariance = (1 / n_samples) * (X - X.mean(0)).T.dot(Y - Y.mean(0))\n    std_dev_X = np.expand_dims(calculate_std_dev(X), 1)\n    std_dev_y = np.expand_dims(calculate_std_dev(Y), 1)\n    correlation_matrix = np.divide(covariance, std_dev_X.dot(std_dev_y.T))\n\n    return np.array(correlation_matrix, dtype=float)\n"
  },
  {
    "path": "mlfromscratch/utils/kernels.py",
    "content": "import numpy as np\n\n\ndef linear_kernel(**kwargs):\n    def f(x1, x2):\n        return np.inner(x1, x2)\n    return f\n\n\ndef polynomial_kernel(power, coef, **kwargs):\n    def f(x1, x2):\n        return (np.inner(x1, x2) + coef)**power\n    return f\n\n\ndef rbf_kernel(gamma, **kwargs):\n    def f(x1, x2):\n        distance = np.linalg.norm(x1 - x2) ** 2\n        return np.exp(-gamma * distance)\n    return f\n"
  },
  {
    "path": "mlfromscratch/utils/misc.py",
    "content": "import progressbar\nfrom mpl_toolkits.mplot3d import Axes3D\nimport matplotlib.pyplot as plt\nimport matplotlib.cm as cmx\nimport matplotlib.colors as colors\nimport numpy as np\n\nfrom mlfromscratch.utils.data_operation import calculate_covariance_matrix\nfrom mlfromscratch.utils.data_operation import calculate_correlation_matrix\nfrom mlfromscratch.utils.data_manipulation import standardize\n\nbar_widgets = [\n    'Training: ', progressbar.Percentage(), ' ', progressbar.Bar(marker=\"-\", left=\"[\", right=\"]\"),\n    ' ', progressbar.ETA()\n]\n\nclass Plot():\n    def __init__(self): \n        self.cmap = plt.get_cmap('viridis')\n\n    def _transform(self, X, dim):\n        covariance = calculate_covariance_matrix(X)\n        eigenvalues, eigenvectors = np.linalg.eig(covariance)\n        # Sort eigenvalues and eigenvector by largest eigenvalues\n        idx = eigenvalues.argsort()[::-1]\n        eigenvalues = eigenvalues[idx][:dim]\n        eigenvectors = np.atleast_1d(eigenvectors[:, idx])[:, :dim]\n        # Project the data onto principal components\n        X_transformed = X.dot(eigenvectors)\n\n        return X_transformed\n\n\n    def plot_regression(self, lines, title, axis_labels=None, mse=None, scatter=None, legend={\"type\": \"lines\", \"loc\": \"lower right\"}):\n        \n        if scatter:\n            scatter_plots = scatter_labels = []\n            for s in scatter:\n                scatter_plots += [plt.scatter(s[\"x\"], s[\"y\"], color=s[\"color\"], s=s[\"size\"])]\n                scatter_labels += [s[\"label\"]]\n            scatter_plots = tuple(scatter_plots)\n            scatter_labels = tuple(scatter_labels)\n\n        for l in lines:\n            li = plt.plot(l[\"x\"], l[\"y\"], color=s[\"color\"], linewidth=l[\"width\"], label=l[\"label\"])\n\n        if mse:\n            plt.suptitle(title)\n            plt.title(\"MSE: %.2f\" % mse, fontsize=10)\n        else:\n            plt.title(title)\n\n        if axis_labels:\n 
           plt.xlabel(axis_labels[\"x\"])\n            plt.ylabel(axis_labels[\"y\"])\n\n        if legend[\"type\"] == \"lines\":\n            # Matplotlib location strings use spaces, not underscores\n            plt.legend(loc=\"lower left\")\n        elif legend[\"type\"] == \"scatter\" and scatter:\n            plt.legend(scatter_plots, scatter_labels, loc=legend[\"loc\"])\n\n        plt.show()\n\n\n\n    # Plot the dataset X and the corresponding labels y in 2D using PCA.\n    def plot_in_2d(self, X, y=None, title=None, accuracy=None, legend_labels=None):\n        X_transformed = self._transform(X, dim=2)\n        x1 = X_transformed[:, 0]\n        x2 = X_transformed[:, 1]\n        class_distr = []\n\n        y = np.array(y).astype(int)\n\n        colors = [self.cmap(i) for i in np.linspace(0, 1, len(np.unique(y)))]\n\n        # Plot the different class distributions\n        for i, l in enumerate(np.unique(y)):\n            _x1 = x1[y == l]\n            _x2 = x2[y == l]\n            _y = y[y == l]\n            class_distr.append(plt.scatter(_x1, _x2, color=colors[i]))\n\n        # Plot legend\n        if legend_labels is not None:\n            plt.legend(class_distr, legend_labels, loc=1)\n\n        # Plot title\n        if title:\n            if accuracy:\n                perc = 100 * accuracy\n                plt.suptitle(title)\n                plt.title(\"Accuracy: %.1f%%\" % perc, fontsize=10)\n            else:\n                plt.title(title)\n\n        # Axis labels\n        plt.xlabel('Principal Component 1')\n        plt.ylabel('Principal Component 2')\n\n        plt.show()\n\n    # Plot the dataset X and the corresponding labels y in 3D using PCA.\n    def plot_in_3d(self, X, y=None):\n        X_transformed = self._transform(X, dim=3)\n        x1 = X_transformed[:, 0]\n        x2 = X_transformed[:, 1]\n        x3 = X_transformed[:, 2]\n        fig = plt.figure()\n        ax = fig.add_subplot(111, projection='3d')\n        ax.scatter(x1, x2, x3, c=y)\n        plt.show()\n\n\n"
  },
  {
    "path": "requirements.txt",
    "content": "matplotlib\nnumpy\nsklearn\npandas\ncvxopt\nscipy\nprogressbar33\nterminaltables\ngym\n"
  },
  {
    "path": "setup.cfg",
    "content": "[metadata]\ndescription-file = README.md\n\n[easy_install]\n\n"
  },
  {
    "path": "setup.py",
    "content": "from setuptools import setup, find_packages\nfrom codecs import open\nfrom os import path\n\n__version__ = '0.0.4'\n\nhere = path.abspath(path.dirname(__file__))\n\n# get the dependencies and installs\nwith open(path.join(here, 'requirements.txt'), encoding='utf-8') as f:\n    all_reqs = f.read().split('\\n')\n\ninstall_requires = [x.strip() for x in all_reqs if 'git+' not in x]\ndependency_links = [x.strip().replace('git+', '') for x in all_reqs if x.startswith('git+')]\n\nsetup(\n    name='mlfromscratch',\n    version=__version__,\n    description='Python implementations of some of the fundamental Machine Learning models and algorithms from scratch.',\n    url='https://github.com/eriklindernoren/ML-From-Scratch',\n    download_url='https://github.com/eriklindernoren/ML-From-Scratch/tarball/master',\n    license='MIT',\n    packages=find_packages(),\n    include_package_data=True,\n    author='Erik Linder-Noren',\n    install_requires=install_requires,\n    setup_requires=['numpy>=1.10', 'scipy>=0.17'],\n    dependency_links=dependency_links,\n    author_email='eriklindernoren@gmail.com'\n)"
  }
]