Full Code of PytLab/MLBox for AI

Repository: PytLab/MLBox
Branch: master
Commit: e916cd8ff9c3
Files: 65
Total size: 821.8 KB

Directory structure:
MLBox/

├── .gitignore
├── README.md
├── Reinforcement Learning/
│   ├── Calculating State Utilities.ipynb
│   ├── Calculating Transition Probabilities.ipynb
│   ├── Defining Initial Distribution.ipynb
│   ├── Policy Iteration Algorithm.ipynb
│   ├── T.npy
│   └── Value Iteration Algorithm.ipynb
├── classification_and_regression_trees/
│   ├── bikeSpeedVsIq_test.txt
│   ├── bikeSpeedVsIq_train.txt
│   ├── compare.py
│   ├── dot/
│   │   ├── ex0.dot
│   │   ├── ex00.dot
│   │   ├── ex2.dot
│   │   ├── ex2_prune.dot
│   │   └── exp2.dot
│   ├── ex0.txt
│   ├── ex00.txt
│   ├── ex2.dot
│   ├── ex2.txt
│   ├── ex2test.txt
│   ├── exp.txt
│   ├── exp2.dot
│   ├── exp2.txt
│   ├── model_tree.py
│   ├── notebook/
│   │   ├── 分段函数回归树.ipynb
│   │   ├── 后剪枝.ipynb
│   │   └── 模型树对分段线性函数进行回归.ipynb
│   ├── prune.py
│   └── regression_tree.py
├── decision_tree/
│   ├── english_big.txt
│   ├── lenses.dot
│   ├── lenses.py
│   ├── lenses.txt
│   ├── sms_tree.dot
│   ├── sms_tree.pkl
│   ├── sms_tree.py
│   ├── sms_tree_2.dot
│   └── trees.py
├── linear_regression/
│   ├── abalone.txt
│   ├── ex0.txt
│   ├── ex1.txt
│   ├── lasso_regression.ipynb
│   ├── lasso_regression.py
│   ├── lasso_traj.ipynb
│   ├── lasso_ws
│   ├── local_weighted_linear_regression.py
│   ├── ridge_regression.ipynb
│   ├── ridge_regression.py
│   ├── stage_wise_regression.py
│   ├── stage_wise_traj.ipynb
│   └── standard_linear_regression.py
├── logistic_regression/
│   ├── english_big.txt
│   ├── logreg_grad_ascent.py
│   ├── logreg_stoch_grad_ascent.py
│   ├── sms.py
│   └── testSet.txt
├── naive_bayes/
│   ├── bayes.py
│   ├── english_big.txt
│   └── sms.py
└── support_vector_machine/
    ├── best_fit.py
    ├── svm_ga.py
    ├── svm_platt_smo.py
    ├── svm_simple_smo.py
    └── testSet.txt

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# dotenv
.env

# virtualenv
.venv
venv/
ENV/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

.DS_Store

*.swp


================================================
FILE: README.md
================================================
# MLBox
Implementations of machine learning algorithms

# Blogs
- [Machine Learning Algorithms in Practice: Decision Trees](http://pytlab.github.io/2017/07/09/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-%E5%86%B3%E7%AD%96%E6%A0%91/)
- [Machine Learning Algorithms in Practice: Naive Bayes](http://pytlab.github.io/2017/07/11/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E5%AE%9E%E8%B7%B5-%E6%9C%B4%E7%B4%A0%E8%B4%9D%E5%8F%B6%E6%96%AF-Naive-Bayes/)
- [Machine Learning Algorithms in Practice: Logistic Regression and Gradient Ascent (Part 1)](http://pytlab.github.io/2017/07/13/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-Logistic%E5%9B%9E%E5%BD%92%E4%B8%8E%E6%A2%AF%E5%BA%A6%E4%B8%8A%E5%8D%87%E7%AE%97%E6%B3%95-%E4%B8%8A/)
- [Machine Learning Algorithms in Practice: Logistic Regression and Gradient Ascent (Part 2)](http://pytlab.github.io/2017/07/15/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-Logistic%E5%9B%9E%E5%BD%92%E4%B8%8E%E6%A2%AF%E5%BA%A6%E4%B8%8A%E5%8D%87%E7%AE%97%E6%B3%95-%E4%B8%8B/)
- [Machine Learning Algorithms in Practice: Support Vector Machine (SVM) Principles](http://pytlab.github.io/2017/08/15/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-%E6%94%AF%E6%8C%81%E5%90%91%E9%87%8F%E6%9C%BA-SVM-%E7%AE%97%E6%B3%95%E5%8E%9F%E7%90%86/)
- [Machine Learning Algorithms in Practice: SVM Kernel Functions and Soft Margins](http://pytlab.github.io/2017/08/30/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-SVM%E6%A0%B8%E5%87%BD%E6%95%B0%E5%92%8C%E8%BD%AF%E9%97%B4%E9%9A%94/)
- [Machine Learning Algorithms in Practice: The SMO Algorithm in SVM](http://pytlab.github.io/2017/09/01/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-SVM%E4%B8%AD%E7%9A%84SMO%E7%AE%97%E6%B3%95/)
- [Machine Learning Algorithms in Practice: Platt SMO and Genetic-Algorithm Optimization of SVM](http://pytlab.github.io/2017/10/15/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-Platt-SMO%E5%92%8C%E9%81%97%E4%BC%A0%E7%AE%97%E6%B3%95%E4%BC%98%E5%8C%96SVM/)
- [Machine Learning Algorithms in Practice: Standard and Locally Weighted Linear Regression](http://pytlab.github.io/2017/10/24/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-%E6%A0%87%E5%87%86%E4%B8%8E%E5%B1%80%E9%83%A8%E5%8A%A0%E6%9D%83%E7%BA%BF%E6%80%A7%E5%9B%9E%E5%BD%92/)
- [Machine Learning Algorithms in Practice: Ridge Regression and LASSO](http://pytlab.github.io/2017/10/27/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E5%AE%9E%E8%B7%B5-%E5%B2%AD%E5%9B%9E%E5%BD%92%E5%92%8CLASSO%E5%9B%9E%E5%BD%92/)
- [Machine Learning Algorithms in Practice: Tree Regression](http://pytlab.github.io/2017/11/03/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-%E6%A0%91%E5%9B%9E%E5%BD%92/)


================================================
FILE: Reinforcement Learning/Calculating State Utilities.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An MDP is a reinterpretation of a Markov chain that adds an agent and a decision-making process. An MDP is defined by these components:\n",
    "1. Set of possible States: S={s0,s1,...,sm}\n",
    "2. Initial State:s0\n",
    "3. Set of possible Actions:A={a0,a1,...,an}\n",
    "4. Transition Model:T(s,a,s′)\n",
    "5. Reward Function: R(s)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We are going to implement an MDP in a 3 x 4 grid world where our agent/robot starts at (1,1) and needs to reach the goal state at (3,4). There is also a fault state at (2,4) which the robot must avoid at all costs. Moving from one state to another earns the robot a reward; naturally, the reward is highest for the goal state and lowest for the fault state. The robot's objective is to maximize its cumulative reward and plan its actions accordingly. It can attempt to move in any direction, but the outcome of each move is stochastic."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To compare states, we calculate their utilities, as shown below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def state_utility(v, T, u, reward, gamma):\n",
    "    \n",
    "    #v is the state vector\n",
    "    #T is the transition matrix\n",
    "    #u is the utility vector\n",
    "    #reward consists of the rewards earned for moving to a particular state\n",
    "    #gamma is the discount factor by which future rewards are discounted\n",
    "\n",
    "    action_array = np.zeros(4)\n",
    "    for action in range(0, 4):\n",
    "        action_array[action] = np.sum(np.multiply(u, np.dot(v, T[:,:,action])))\n",
    "    return reward + gamma * np.max(action_array)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Utility of state (1,1): 0.7056\n"
     ]
    }
   ],
   "source": [
    "def main():\n",
    "    \n",
    "    #The agent starts from (1, 1)\n",
    "    v = np.array([[0.0, 0.0, 0.0, 0.0, \n",
    "                   0.0, 0.0, 0.0, 0.0, \n",
    "                   1.0, 0.0, 0.0, 0.0]])\n",
    "    \n",
    "    #file loaded from the folder\n",
    "    T = np.load(\"T.npy\")\n",
    "\n",
    "    #Utility vector\n",
    "    u = np.array([[0.812, 0.868, 0.918,   1.0,\n",
    "                   0.762,   0.0, 0.660,  -1.0,\n",
    "                   0.705, 0.655, 0.611, 0.388]])\n",
    "\n",
    "    #Define the reward for state (1,1)\n",
    "    reward = -0.04\n",
    "    #Assume that the discount factor is equal to 1.0\n",
    "    gamma = 1.0\n",
    "\n",
    "    #Use the Bellman equation to find the utility of state (1,1)\n",
    "    utility_11 = state_utility(v, T, u, reward, gamma)\n",
    "    print(\"Utility of state (1,1): \" + str(utility_11))\n",
    "\n",
    "if __name__ == \"__main__\":\n",
    "    main()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
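
The Bellman backup in `state_utility` depends on the binary `T.npy`, which this dump cannot show. As a sanity check, the same function can be exercised on a hand-built toy model; the 2-state, 2-action transition tensor below is an illustrative assumption, not the grid-world data:

```python
import numpy as np

def state_utility(v, T, u, reward, gamma):
    # One Bellman backup: reward plus the discounted expected utility
    # of the best action's next-state distribution.
    action_array = np.zeros(T.shape[2])
    for action in range(T.shape[2]):
        action_array[action] = np.sum(np.multiply(u, np.dot(v, T[:, :, action])))
    return reward + gamma * np.max(action_array)

# Toy model (assumed): T[s, s', a] is the probability of landing in
# state s' after taking action a in state s.
T = np.zeros((2, 2, 2))
T[:, :, 0] = [[1.0, 0.0], [1.0, 0.0]]  # action 0 always leads to state 0
T[:, :, 1] = [[0.0, 1.0], [0.0, 1.0]]  # action 1 always leads to state 1

u = np.array([[0.0, 1.0]])  # state 1 currently has utility 1
v = np.array([[1.0, 0.0]])  # the agent sits in state 0

u0 = state_utility(v, T, u, reward=-0.04, gamma=0.9)
print(u0)  # -0.04 + 0.9 * 1.0
```

The best action from state 0 is clearly action 1 (it reaches the only state with positive utility), so the backup evaluates to the immediate reward plus the discounted utility of state 1.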


================================================
FILE: Reinforcement Learning/Calculating Transition Probabilities.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " 1. Set of possible states : S  = {s0,s1,s2,......,sn} \n",
    " 2. Initial State: s0 \n",
    " 3. Transition Model: T(s,s')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let’s suppose we have a chain with only two states s0 and s1, where s0 is the initial state. The process is in s0 90% of the time and it can move to s1 the remaining 10% of the time. When the process is in state s1 it will remain there 50% of the time. Given this data we can create a Transition Matrix T as follows:\n",
    "T=[[0.90 0.10]\n",
    "   [0.50 0.50]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Computing the k-step transition probability:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "T: [[0.9 0.1]\n",
      " [0.5 0.5]]\n",
      "T_5: [[0.83504 0.16496]\n",
      " [0.8248  0.1752 ]]\n",
      "T_25: [[0.83333333 0.16666667]\n",
      " [0.83333333 0.16666667]]\n",
      "T_50: [[0.83333333 0.16666667]\n",
      " [0.83333333 0.16666667]]\n",
      "T_100: [[0.83333333 0.16666667]\n",
      " [0.83333333 0.16666667]]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "#Here we declare the Transition Matrix T\n",
    "T = np.array([[0.90, 0.10],\n",
    "              [0.50, 0.50]])\n",
    "\n",
    "#Obtain T after 5 steps\n",
    "T_5 = np.linalg.matrix_power(T, 5)\n",
    "\n",
    "#Obtain T after 25 steps\n",
    "T_25 = np.linalg.matrix_power(T, 25)\n",
    "\n",
    "#Obtain T after 50 steps\n",
    "T_50 = np.linalg.matrix_power(T, 50)\n",
    "\n",
    "#Obtain T after 100 steps\n",
    "T_100 = np.linalg.matrix_power(T, 100)\n",
    "\n",
    "#Print the matrices\n",
    "print(\"T: \" + str(T))\n",
    "print(\"T_5: \" + str(T_5))\n",
    "print(\"T_25: \" + str(T_25))\n",
    "print(\"T_50: \" + str(T_50))\n",
    "print(\"T_100: \" + str(T_100))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
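
The convergence of the k-step matrices above can be double-checked in a few lines. The closed-form stationary row [5/6, 1/6] is a hand calculation added here (solve πT = π with π summing to 1), not something stated in the notebook:

```python
import numpy as np

# Transition matrix from the notebook above.
T = np.array([[0.90, 0.10],
              [0.50, 0.50]])

# The k-step transition probabilities are matrix powers of T.
T_100 = np.linalg.matrix_power(T, 100)

# Solving pi = pi @ T with pi summing to 1 gives pi = [5/6, 1/6];
# both rows of T^k converge to this stationary distribution.
pi = np.array([5.0 / 6.0, 1.0 / 6.0])
print(T_100)
```

Every row of T_100 matches pi to numerical precision, which is exactly the pattern visible in the printed T_25, T_50, and T_100 above.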


================================================
FILE: Reinforcement Learning/Defining Initial Distribution.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us now define the initial distribution which represents the state of the system at k=0.\n",
    "Our system is composed of two states, so we can model the initial distribution as a vector with two elements: the first element is the probability of being in state s0 and the second the probability of being in state s1. If we start from s0, the vector v representing the initial distribution has this form:\n",
    "v=(1,0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Calculating the probability of being in a specific state after k iterations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "v: [[1. 0.]]\n",
      "v_1: [[0.9 0.1]]\n",
      "v_5: [[0.83504 0.16496]]\n",
      "v_25: [[0.83333333 0.16666667]]\n",
      "v_50: [[0.83333333 0.16666667]]\n",
      "v_100: [[0.83333333 0.16666667]]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "#Declare the initial distribution\n",
    "v = np.array([[1.0, 0.0]])\n",
    "\n",
    "#Declare the Transition Matrix T (the same matrix used in 'Calculating Transition Probabilities')\n",
    "T = np.array([[0.90, 0.10],\n",
    "              [0.50, 0.50]])\n",
    "\n",
    "#Obtain T after 5 steps\n",
    "T_5 = np.linalg.matrix_power(T, 5)\n",
    "\n",
    "#Obtain T after 25 steps\n",
    "T_25 = np.linalg.matrix_power(T, 25)\n",
    "\n",
    "#Obtain T after 50 steps\n",
    "T_50 = np.linalg.matrix_power(T, 50)\n",
    "\n",
    "#Obtain T after 100 steps\n",
    "T_100 = np.linalg.matrix_power(T, 100)\n",
    "\n",
    "#Printing the initial distribution\n",
    "print(\"v: \" + str(v))\n",
    "print(\"v_1: \" + str(np.dot(v,T)))\n",
    "print(\"v_5: \" + str(np.dot(v,T_5)))\n",
    "print(\"v_25: \" + str(np.dot(v,T_25)))\n",
    "print(\"v_50: \" + str(np.dot(v,T_50)))\n",
    "print(\"v_100: \" + str(np.dot(v,T_100)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The results after 50 and 100 iterations are the same: v_50 equals v_100 regardless of the starting distribution. The chain has converged to equilibrium, meaning that as time progresses it forgets its starting distribution."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
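
The claim that the chain forgets its starting distribution can be checked directly: the limiting vector is the left eigenvector of T for eigenvalue 1. This sketch (an addition, not part of the notebook) compares two very different starting distributions:

```python
import numpy as np

T = np.array([[0.90, 0.10],
              [0.50, 0.50]])

# The stationary distribution is the left eigenvector of T with
# eigenvalue 1, normalized so its entries sum to 1.
eigvals, eigvecs = np.linalg.eig(T.T)
stationary = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
stationary = stationary / stationary.sum()

# Two different initial distributions end up in the same place.
v_a = np.array([1.0, 0.0]) @ np.linalg.matrix_power(T, 100)
v_b = np.array([0.0, 1.0]) @ np.linalg.matrix_power(T, 100)
print(stationary)
```

Both v_a and v_b land on the stationary vector, matching the v_50 and v_100 values printed in the notebook.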


================================================
FILE: Reinforcement Learning/Policy Iteration Algorithm.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Policy iteration is guaranteed to converge, and at convergence the current policy and its utility function are the optimal policy and the optimal utility function. First, we define a policy π that assigns an action to each state; the initial actions can be chosen at random.\n",
    "Once we evaluate the policy we can improve it; policy improvement is the second and last step of the algorithm. Our environment has a finite number of states and therefore a finite number of policies, and each iteration yields a better policy."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Implementing the policy iteration algorithm:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def return_policy_evaluation(p, u, r, T, gamma):\n",
    "\n",
    "    #p is the policy vector (an action index per state)\n",
    "    #u is the utility vector\n",
    "    #r is the reward vector (the reward earned for each state)\n",
    "    #T is the transition matrix\n",
    "    #gamma is the discount factor by which future rewards are discounted\n",
    "    for s in range(12):\n",
    "        if not np.isnan(p[s]):\n",
    "            v = np.zeros((1,12))\n",
    "            v[0,s] = 1.0\n",
    "            action = int(p[s])\n",
    "            u[s] = r[s] + gamma * np.sum(np.multiply(u, np.dot(v, T[:,:,action])))\n",
    "    return u"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "def return_expected_action(u, T, v):\n",
    "    \n",
    "    #Return the action that maximizes the expected\n",
    "    #utility of acting in state s (encoded by v),\n",
    "    #according to T and u.\n",
    "    \n",
    "    actions_array = np.zeros(4)\n",
    "    for action in range(4):\n",
    "       #Expected utility of doing a in state s, according to T and u.\n",
    "       actions_array[action] = np.sum(np.multiply(u, np.dot(v, T[:,:,action])))\n",
    "    return np.argmax(actions_array)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "def print_policy(p, shape):\n",
    "    \"\"\"Printing utility.\n",
    "\n",
    "    Print the policy actions using symbols:\n",
    "    ^, v, <, > up, down, left, right\n",
    "    * terminal states\n",
    "    # obstacles\n",
    "    \"\"\"\n",
    "    counter = 0\n",
    "    policy_string = \"\"\n",
    "    for row in range(shape[0]):\n",
    "        for col in range(shape[1]):\n",
    "            if(p[counter] == -1): policy_string += \" *  \"            \n",
    "            elif(p[counter] == 0): policy_string += \" ^  \"\n",
    "            elif(p[counter] == 1): policy_string += \" <  \"\n",
    "            elif(p[counter] == 2): policy_string += \" v  \"           \n",
    "            elif(p[counter] == 3): policy_string += \" >  \"\n",
    "            elif(np.isnan(p[counter])): policy_string += \" #  \"\n",
    "            counter += 1\n",
    "        policy_string += '\\n'\n",
    "    print(policy_string)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " v   <   >   *  \n",
      " ^   #   <   *  \n",
      " <   <   ^   v  \n",
      "\n",
      " ^   >   >   *  \n",
      " ^   #   ^   *  \n",
      " <   >   ^   v  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " >   >   ^   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " >   >   ^   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   >   ^   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   >   ^   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   ^   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   ^   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   ^   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   <   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   <   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   <   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   <   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   <   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   <   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   <   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   <   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   <   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   <   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   <   <  \n",
      "\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   <   <  \n",
      "\n",
      "=================== FINAL RESULT ==================\n",
      "Iterations: 22\n",
      "Delta: 9.043213450299348e-08\n",
      "Gamma: 0.999\n",
      "Epsilon: 0.0001\n",
      "===================================================\n",
      "[0.80796344 0.86539911 0.91653199 1.        ]\n",
      "[ 0.75696624  0.          0.65836281 -1.        ]\n",
      "[0.69968295 0.64882105 0.60471972 0.38150427]\n",
      "===================================================\n",
      " >   >   >   *  \n",
      " ^   #   ^   *  \n",
      " ^   <   <   <  \n",
      "\n",
      "===================================================\n"
     ]
    }
   ],
   "source": [
    "def main():\n",
    "    gamma = 0.999\n",
    "    epsilon = 0.0001\n",
    "    iteration = 0\n",
    "    T = np.load(\"T.npy\")\n",
    "    #Generate the first policy randomly\n",
    "    # NaN=Nothing, -1=Terminal, 0=Up, 1=Left, 2=Down, 3=Right\n",
    "    p = np.random.randint(0, 4, size=(12)).astype(np.float32)\n",
    "    p[5] = np.NaN\n",
    "    p[3] = p[7] = -1\n",
    "    #Utility vectors\n",
    "    u = np.array([0.0, 0.0, 0.0,  0.0,\n",
    "                  0.0, 0.0, 0.0,  0.0,\n",
    "                  0.0, 0.0, 0.0,  0.0])\n",
    "    #Reward vector\n",
    "    r = np.array([-0.04, -0.04, -0.04,  +1.0,\n",
    "                  -0.04,   0.0, -0.04,  -1.0,\n",
    "                  -0.04, -0.04, -0.04, -0.04])\n",
    "\n",
    "    while True:\n",
    "        iteration += 1\n",
    "        #1- Policy evaluation\n",
    "        u_0 = u.copy()\n",
    "        u = return_policy_evaluation(p, u, r, T, gamma)\n",
    "        #Stopping criteria\n",
    "        delta = np.absolute(u - u_0).max()\n",
    "        if delta < epsilon * (1 - gamma) / gamma: break\n",
    "        for s in range(12):\n",
    "            if not np.isnan(p[s]) and not p[s]==-1:\n",
    "                v = np.zeros((1,12))\n",
    "                v[0,s] = 1.0\n",
    "                #2- Policy improvement\n",
    "                a = return_expected_action(u, T, v)         \n",
    "                if a != p[s]: p[s] = a\n",
    "        print_policy(p, shape=(3,4))\n",
    "\n",
    "    print(\"=================== FINAL RESULT ==================\")\n",
    "    print(\"Iterations: \" + str(iteration))\n",
    "    print(\"Delta: \" + str(delta))\n",
    "    print(\"Gamma: \" + str(gamma))\n",
    "    print(\"Epsilon: \" + str(epsilon))\n",
    "    print(\"===================================================\")\n",
    "    print(u[0:4])\n",
    "    print(u[4:8])\n",
    "    print(u[8:12])\n",
    "    print(\"===================================================\")\n",
    "    print_policy(p, shape=(3,4))\n",
    "    print(\"===================================================\")\n",
    "\n",
    "if __name__ == \"__main__\":\n",
    "    main()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
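
The evaluation step above sweeps the Bellman expectation equation iteratively; for a fixed policy it can also be solved exactly as the linear system (I - gamma * T_pi) u = r. A sketch on a hypothetical 2-state chain (the matrix and rewards are illustrative assumptions, not the grid world):

```python
import numpy as np

# Under a fixed policy, the induced chain has a single transition
# matrix T_pi (assumed numbers, for illustration only).
T_pi = np.array([[0.9, 0.1],
                 [0.5, 0.5]])
r = np.array([-0.04, 1.0])
gamma = 0.9

# Bellman expectation equation: u = r + gamma * T_pi @ u,
# rearranged to (I - gamma * T_pi) u = r and solved directly.
u_exact = np.linalg.solve(np.eye(2) - gamma * T_pi, r)

# Iterative evaluation (as in the notebook) converges to the same vector.
u = np.zeros(2)
for _ in range(1000):
    u = r + gamma * T_pi @ u
print(u_exact)
```

The direct solve is attractive for small state spaces; the notebook's sweep-based evaluation avoids the matrix inversion and scales better when only a few updates are needed per improvement step.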


================================================
FILE: Reinforcement Learning/Value Iteration Algorithm.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Value Iteration algorithm repeatedly updates the utilities of all states until an equilibrium is reached, then compares them to determine the best action to take. The equilibrium is detected with a stopping criterion: iteration stops once no state's utility changes by more than a small amount between two consecutive iterations."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Implementing the Value Iteration algorithm:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def state_utility(v, T, u, reward, gamma):\n",
    "    \n",
    "    #v is the state vector\n",
    "    #T is the transition matrix\n",
    "    #u is the utility vector\n",
    "    #reward consists of the rewards earned for moving to a particular state\n",
    "    #gamma is the discount factor by which future rewards are discounted\n",
    "\n",
    "    action_array = np.zeros(4)\n",
    "    for action in range(0, 4):\n",
    "        action_array[action] = np.sum(np.multiply(u, np.dot(v, T[:,:,action])))\n",
    "    return reward + gamma * np.max(action_array)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "=================== FINAL RESULT ==================\n",
      "Iterations: 26\n",
      "Delta: 9.511968687869743e-06\n",
      "Gamma: 0.999\n",
      "Epsilon: 0.01\n",
      "===================================================\n",
      "[0.80796341 0.86539911 0.91653199 1.        ]\n",
      "[ 0.75696613  0.          0.65836281 -1.        ]\n",
      "[0.69968168 0.64881721 0.60471137 0.3814863 ]\n",
      "===================================================\n"
     ]
    }
   ],
   "source": [
    "def main():\n",
    "    \n",
    "    tot_states = 12\n",
    "    gamma = 0.999 \n",
    "    iteration = 0 #Iteration counter\n",
    "    epsilon = 0.01 #Stopping criteria given a small value\n",
    "\n",
    "    #List containing the data for each iteration\n",
    "    graph_list = list()\n",
    "\n",
    "    #Transition matrix loaded from file\n",
    "    T = np.load(\"T.npy\")\n",
    "\n",
    "    #Reward vector\n",
    "    r = np.array([-0.04, -0.04, -0.04,  +1.0,\n",
    "                  -0.04,   0.0, -0.04,  -1.0,\n",
    "                  -0.04, -0.04, -0.04, -0.04])    \n",
    "\n",
    "    #Utility vectors\n",
    "    u = np.array([0.0, 0.0, 0.0,  0.0,\n",
    "                   0.0, 0.0, 0.0,  0.0,\n",
    "                   0.0, 0.0, 0.0,  0.0])\n",
    "    \n",
    "    u1 = np.array([0.0, 0.0, 0.0,  0.0,\n",
    "                    0.0, 0.0, 0.0,  0.0,\n",
    "                    0.0, 0.0, 0.0,  0.0])\n",
    "\n",
    "    while True:\n",
    "        delta = 0\n",
    "        u = u1.copy()\n",
    "        iteration += 1\n",
    "        graph_list.append(u)\n",
    "        for s in range(tot_states):\n",
    "            reward = r[s]\n",
    "            v = np.zeros((1,tot_states))\n",
    "            v[0,s] = 1.0\n",
    "            u1[s] = state_utility(v, T, u, reward, gamma)\n",
    "            delta = max(delta, np.abs(u1[s] - u[s])) #Stopping criteria checked    \n",
    "            \n",
    "        if delta < epsilon * (1 - gamma) / gamma:\n",
    "                print(\"=================== FINAL RESULT ==================\")\n",
    "                print(\"Iterations: \" + str(iteration))\n",
    "                print(\"Delta: \" + str(delta))\n",
    "                print(\"Gamma: \" + str(gamma))\n",
    "                print(\"Epsilon: \" + str(epsilon))\n",
    "                print(\"===================================================\")\n",
    "                print(u[0:4])\n",
    "                print(u[4:8])\n",
    "                print(u[8:12])\n",
    "                print(\"===================================================\")\n",
    "                break\n",
    "\n",
    "if __name__ == \"__main__\":\n",
    "    main()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
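
The stopping test `delta < epsilon * (1 - gamma) / gamma` guarantees the returned utilities are within epsilon of the true ones. A compact value-iteration sketch on a hypothetical 2-state, 2-action model (all numbers are illustrative assumptions, not the grid world):

```python
import numpy as np

# Assumed toy MDP: T[s, s', a] with 2 states and 2 actions.
T = np.zeros((2, 2, 2))
T[:, :, 0] = [[1.0, 0.0], [1.0, 0.0]]  # action 0 always leads to state 0
T[:, :, 1] = [[0.0, 1.0], [0.0, 1.0]]  # action 1 always leads to state 1
r = np.array([-0.04, 1.0])
gamma, epsilon = 0.9, 1e-4

u = np.zeros(2)
while True:
    # One sweep of Bellman optimality backups over all states.
    u1 = np.array([r[s] + gamma * max(T[s, :, a] @ u for a in range(2))
                   for s in range(2)])
    delta = np.abs(u1 - u).max()
    u = u1
    if delta < epsilon * (1 - gamma) / gamma:
        break

print(u)  # close to [8.96, 10.0]: both states prefer moving to state 1
```

Here the fixed point can be verified by hand: u(1) = 1 + 0.9 * u(1) gives u(1) = 10, and u(0) = -0.04 + 0.9 * 10 = 8.96, which the loop approaches to within epsilon.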


================================================
FILE: classification_and_regression_trees/bikeSpeedVsIq_test.txt
================================================
12.000000	121.010516
19.000000	157.337044
12.000000	116.031825
15.000000	132.124872
2.000000	52.719612
6.000000	39.058368
3.000000	50.757763
20.000000	166.740333
11.000000	115.808227
21.000000	165.582995
3.000000	41.956087
3.000000	34.432370
13.000000	116.954676
1.000000	32.112553
7.000000	50.380243
7.000000	94.107791
23.000000	188.943179
18.000000	152.637773
9.000000	104.122082
18.000000	127.805226
0.000000	83.083232
15.000000	148.180104
3.000000	38.480247
8.000000	77.597839
7.000000	75.625803
11.000000	124.620208
13.000000	125.186698
5.000000	51.165922
3.000000	31.179113
15.000000	132.505727
19.000000	137.978043
9.000000	106.481123
20.000000	172.149955
11.000000	104.116556
4.000000	22.457996
20.000000	175.735047
18.000000	165.350412
22.000000	177.461724
16.000000	138.672986
17.000000	156.791788
19.000000	150.327544
19.000000	156.992196
23.000000	163.624262
8.000000	92.537227
3.000000	32.341399
16.000000	144.445614
11.000000	119.985586
16.000000	145.149335
12.000000	113.284662
5.000000	47.742716
11.000000	115.852585
3.000000	31.579325
1.000000	43.758671
1.000000	61.049125
13.000000	132.751826
23.000000	163.233087
12.000000	115.134296
8.000000	91.370839
8.000000	86.137955
14.000000	120.857934
3.000000	33.777477
10.000000	110.831763
10.000000	104.174775
20.000000	155.920696
4.000000	30.619132
0.000000	71.880474
7.000000	86.399516
7.000000	72.632906
5.000000	58.632985
18.000000	143.584511
23.000000	187.059504
6.000000	65.067119
6.000000	69.110280
19.000000	142.388056
15.000000	137.174489
21.000000	159.719092
9.000000	102.179638
20.000000	176.416294
21.000000	146.516385
18.000000	147.808343
23.000000	154.790810
16.000000	137.385285
18.000000	166.885975
15.000000	136.989000
20.000000	144.668679
14.000000	137.060671
19.000000	140.468283
11.000000	98.344084
16.000000	132.497910
1.000000	59.143101
20.000000	152.299381
13.000000	134.487271
0.000000	77.805718
3.000000	28.543764
10.000000	97.751817
4.000000	41.223659
11.000000	110.017015
12.000000	119.391386
20.000000	158.872126
2.000000	38.776222
19.000000	150.496148
15.000000	131.505967
22.000000	179.856157
13.000000	143.090102
14.000000	142.611861
13.000000	120.757410
4.000000	27.929324
16.000000	151.530849
15.000000	148.149702
5.000000	44.188084
16.000000	141.135406
12.000000	119.817665
8.000000	80.991524
3.000000	29.308640
6.000000	48.203468
8.000000	92.179834
22.000000	162.720371
10.000000	91.971158
2.000000	33.481943
8.000000	88.528612
1.000000	54.042173
8.000000	92.002928
5.000000	45.614646
3.000000	34.319635
14.000000	129.140558
17.000000	146.807901
17.000000	157.694058
4.000000	37.080929
20.000000	169.942381
10.000000	114.675638
5.000000	34.913029
14.000000	137.889747
0.000000	79.043129
16.000000	139.084390
6.000000	53.340135
13.000000	142.772612
0.000000	73.103173
3.000000	37.717487
15.000000	134.116395
18.000000	138.748257
23.000000	180.779121
10.000000	93.721894
23.000000	166.958335
6.000000	74.473589
6.000000	73.006291
3.000000	34.178656
1.000000	33.395482
22.000000	149.933384
18.000000	154.858982
6.000000	66.121084
1.000000	60.816800
5.000000	55.681020
6.000000	61.251558
15.000000	125.452206
16.000000	134.310255
19.000000	167.999681
5.000000	40.074830
22.000000	162.658997
12.000000	109.473909
4.000000	44.743405
11.000000	122.419496
14.000000	139.852014
21.000000	160.045407
15.000000	131.999358
15.000000	135.577799
20.000000	173.494629
8.000000	82.497177
12.000000	123.122032
10.000000	97.592026
16.000000	141.345706
8.000000	79.588881
3.000000	54.308878
4.000000	36.112937
19.000000	165.005336
23.000000	172.198031
15.000000	127.699625
1.000000	47.305217
13.000000	115.489379
8.000000	103.956569
4.000000	53.669477
0.000000	76.220652
12.000000	114.153306
6.000000	74.608728
3.000000	41.339299
5.000000	21.944048
22.000000	181.455655
20.000000	171.691444
10.000000	104.299002
21.000000	168.307123
20.000000	169.556523
23.000000	175.960552
1.000000	42.554778
14.000000	137.286185
16.000000	136.126561
12.000000	119.269042
6.000000	63.426977
4.000000	27.728212
4.000000	32.687588
23.000000	151.153204
15.000000	129.767331


================================================
FILE: classification_and_regression_trees/bikeSpeedVsIq_train.txt
================================================
3.000000	46.852122
23.000000	178.676107
0.000000	86.154024
6.000000	68.707614
15.000000	139.737693
17.000000	141.988903
12.000000	94.477135
8.000000	86.083788
9.000000	97.265824
7.000000	80.400027
8.000000	83.414554
1.000000	52.525471
16.000000	127.060008
9.000000	101.639269
14.000000	146.412680
15.000000	144.157101
17.000000	152.699910
19.000000	136.669023
21.000000	166.971736
21.000000	165.467251
3.000000	38.455193
6.000000	75.557721
4.000000	22.171763
5.000000	50.321915
0.000000	74.412428
5.000000	42.052392
1.000000	42.489057
14.000000	139.185416
21.000000	140.713725
5.000000	63.222944
5.000000	56.294626
9.000000	91.674826
22.000000	173.497655
17.000000	152.692482
9.000000	113.920633
1.000000	51.552411
9.000000	100.075315
16.000000	137.803868
18.000000	135.925777
3.000000	45.550762
16.000000	149.933224
2.000000	27.914173
6.000000	62.103546
20.000000	173.942381
12.000000	119.200505
6.000000	70.730214
16.000000	156.260832
15.000000	132.467643
19.000000	161.164086
17.000000	138.031844
23.000000	169.747881
11.000000	116.761920
4.000000	34.305905
6.000000	68.841160
10.000000	119.535227
20.000000	158.104763
18.000000	138.390511
5.000000	59.375794
7.000000	80.802300
11.000000	108.611485
10.000000	91.169028
15.000000	154.104819
5.000000	51.100287
3.000000	32.334330
15.000000	150.551655
10.000000	111.023073
0.000000	87.489950
2.000000	46.726299
7.000000	92.540440
15.000000	135.715438
19.000000	152.960552
19.000000	162.789223
21.000000	167.176240
22.000000	164.323358
12.000000	104.823071
1.000000	35.554328
11.000000	114.784640
1.000000	36.819570
12.000000	130.266826
12.000000	126.053312
18.000000	153.378289
7.000000	70.089159
15.000000	139.528624
19.000000	157.137999
23.000000	183.595248
7.000000	73.431043
11.000000	128.176167
22.000000	183.181247
13.000000	112.685801
18.000000	161.634783
6.000000	63.169478
7.000000	63.393975
19.000000	165.779578
14.000000	143.973398
22.000000	185.131852
3.000000	45.275591
6.000000	62.018003
0.000000	83.193398
7.000000	76.847802
19.000000	147.087386
7.000000	62.812086
1.000000	49.910068
11.000000	102.169335
11.000000	105.108121
6.000000	63.429817
12.000000	121.301542
17.000000	163.253962
13.000000	119.588698
0.000000	87.333807
20.000000	144.484066
21.000000	168.792482
23.000000	159.751246
20.000000	162.843592
14.000000	145.664069
19.000000	146.838515
12.000000	132.049377
18.000000	155.756119
22.000000	155.686345
7.000000	73.913958
1.000000	66.761881
7.000000	65.855450
6.000000	56.271026
19.000000	155.308523
12.000000	124.372873
17.000000	136.025960
14.000000	132.996861
21.000000	172.639791
17.000000	135.672594
8.000000	90.323742
5.000000	62.462698
16.000000	159.048794
14.000000	139.991227
3.000000	37.026678
9.000000	100.839901
9.000000	93.097395
15.000000	123.645221
15.000000	147.327185
1.000000	40.055830
0.000000	88.192829
17.000000	139.174517
22.000000	169.354493
17.000000	136.354272
9.000000	90.692829
7.000000	63.987997
14.000000	128.972231
10.000000	108.433394
2.000000	49.321034
19.000000	171.615671
9.000000	97.894855
0.000000	68.962453
9.000000	72.063371
22.000000	157.000070
12.000000	114.461754
6.000000	58.239465
9.000000	104.601048
8.000000	90.772359
22.000000	164.428791
5.000000	34.804083
5.000000	37.089459
22.000000	177.987605
10.000000	89.439608
6.000000	70.711362
23.000000	181.731482
20.000000	151.538932
7.000000	66.067228
6.000000	61.565125
20.000000	184.441687
9.000000	91.569158
9.000000	98.833425
17.000000	144.352866
9.000000	94.498314
15.000000	121.922732
18.000000	166.408274
10.000000	89.571299
8.000000	75.373772
22.000000	161.001478
8.000000	90.594227
5.000000	57.180933
20.000000	161.643007
8.000000	87.197370
8.000000	95.584308
15.000000	126.207221
7.000000	84.528209
18.000000	161.056986
10.000000	86.762615
1.000000	33.325906
9.000000	105.095502
2.000000	22.440421
9.000000	93.449284
14.000000	106.249595
21.000000	163.254385
22.000000	161.746628
20.000000	152.973085
17.000000	122.918987
7.000000	58.536412
1.000000	45.013277
13.000000	137.294148
10.000000	88.123737
2.000000	45.847376
20.000000	163.385797


================================================
FILE: classification_and_regression_trees/compare.py
================================================
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import numpy as np
import matplotlib.pyplot as plt

from regression_tree import *
from model_tree import linear_regression

def get_corrcoef(X, Y):
    ''' Pearson correlation coefficient between X and Y. '''
    # Covariance of X and Y (population form: E[XY] - E[X]E[Y]).
    cov = np.mean(X*Y) - np.mean(X)*np.mean(Y)
    # Normalize by the product of the (population) standard deviations.
    return cov/(np.var(X)*np.var(Y))**0.5

if '__main__' == __name__:
    # Load training and test data.
    data_train = load_data('bikeSpeedVsIq_train.txt')
    data_test = load_data('bikeSpeedVsIq_test.txt')

    dataset_test = np.matrix(data_test)
    m, n = dataset_test.shape
    testset = np.ones((m, n+1))
    testset[:, 1:] = dataset_test
    X_test, y_test = testset[:, :-1], testset[:, -1]

    # Fit a standard linear regression model.
    w, X, y = linear_regression(data_train)
    y_lr = X_test*w
    y_test = np.array(y_test).T
    y_lr = np.array(y_lr).T[0]
    corrcoef_lr = get_corrcoef(y_test, y_lr)
    print('linear regression correlation coefficient: {}'.format(corrcoef_lr))

    # Build the regression tree model (fleaf/ferr come from regression_tree).
    tree = create_tree(data_train, fleaf, ferr, opt={'err_tolerance': 1,
                                                     'n_tolerance': 4})
    y_tree = [tree_predict([x], tree) for x in X_test[:, 1].tolist()]
    corrcoef_tree = get_corrcoef(np.array(y_tree), y_test)
    print('regression tree correlation coefficient: {}'.format(corrcoef_tree))

    plt.scatter(np.array(data_train)[:, 0], np.array(data_train)[:, 1])
    # Plot the linear regression line.
    x = np.sort([i for i in X_test[:, 1].tolist()])
    y = [np.dot([1.0, i], np.array(w.T).tolist()[0]) for i in x]
    plt.plot(x, y, c='r')

    # Plot the regression tree predictions.
    y = [tree_predict([i], tree) for i in x]
    plt.plot(x, y, c='y')
    plt.show()
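The hand-rolled `get_corrcoef` above uses population covariance and variances; since the same normalization factor appears in numerator and denominator, it is mathematically equivalent to NumPy's built-in `np.corrcoef`. A minimal sanity check on synthetic data (the seed and noise scale are arbitrary choices for illustration):

```python
import numpy as np

def get_corrcoef(X, Y):
    # Population covariance: E[XY] - E[X]E[Y].
    cov = np.mean(X*Y) - np.mean(X)*np.mean(Y)
    # Normalize by the product of the (population) standard deviations.
    return cov/(np.var(X)*np.var(Y))**0.5

# Synthetic linearly-correlated data.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0*x + rng.normal(scale=0.5, size=200)

r_manual = get_corrcoef(x, y)
r_numpy = np.corrcoef(x, y)[0, 1]
print(r_manual, r_numpy)  # the two values should agree to floating-point precision
```

The sample-vs-population distinction (ddof) cancels in the ratio, which is why the two results coincide exactly up to rounding.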



================================================
FILE: classification_and_regression_trees/dot/ex0.dot
================================================
digraph decision_tree {
    "5db27cbb-29af-4987-9cd2-9217c781000d" [label="0: 0.400158"];
    "a81daf61-ab07-4e65-8b8a-55ee0bd0b40c" [label="0: 0.208197"];
    "1f1412f1-659b-4347-8013-f6e57e634c2b" [label="-0.02"];
    "32292eec-1e38-4eff-9700-cba03d93d7d8" [label="1.03"];
    "7cee5c66-0140-4be7-ab6e-01245b3c8199" [label="0: 0.609483"];
    "3308a031-f17e-494c-b015-9b5f3d904dba" [label="1.98"];
    "d53cb038-a0fa-4635-9a07-36d40c33d6b9" [label="0: 0.816742"];
    "adeac9bb-ef8a-4a91-a821-bade5e047d0d" [label="2.98"];
    "208da594-0e82-4b01-af38-b60bac08624d" [label="3.99"];
    "5db27cbb-29af-4987-9cd2-9217c781000d" -> "a81daf61-ab07-4e65-8b8a-55ee0bd0b40c";
    "a81daf61-ab07-4e65-8b8a-55ee0bd0b40c" -> "1f1412f1-659b-4347-8013-f6e57e634c2b";
    "a81daf61-ab07-4e65-8b8a-55ee0bd0b40c" -> "32292eec-1e38-4eff-9700-cba03d93d7d8";
    "5db27cbb-29af-4987-9cd2-9217c781000d" -> "7cee5c66-0140-4be7-ab6e-01245b3c8199";
    "7cee5c66-0140-4be7-ab6e-01245b3c8199" -> "3308a031-f17e-494c-b015-9b5f3d904dba";
    "7cee5c66-0140-4be7-ab6e-01245b3c8199" -> "d53cb038-a0fa-4635-9a07-36d40c33d6b9";
    "d53cb038-a0fa-4635-9a07-36d40c33d6b9" -> "adeac9bb-ef8a-4a91-a821-bade5e047d0d";
    "d53cb038-a0fa-4635-9a07-36d40c33d6b9" -> "208da594-0e82-4b01-af38-b60bac08624d";
}

================================================
FILE: classification_and_regression_trees/dot/ex00.dot
================================================
digraph decision_tree {
    "ccd352d8-dbf6-4f59-ae0b-c983f39e5c87" [label="0: 0.50794"];
    "46052817-27f4-4748-8f02-43ee4d2315dc" [label="-0.04"];
    "b2df42be-284f-415f-8db7-18f807450a5b" [label="1.02"];
    "ccd352d8-dbf6-4f59-ae0b-c983f39e5c87" -> "46052817-27f4-4748-8f02-43ee4d2315dc";
    "ccd352d8-dbf6-4f59-ae0b-c983f39e5c87" -> "b2df42be-284f-415f-8db7-18f807450a5b";
}

================================================
FILE: classification_and_regression_trees/dot/ex2.dot
================================================
digraph decision_tree {
    "bdbe6f68-a446-4539-8a80-860f22663afe" [label="0: 0.508542"];
    "acef94b2-b18f-4c9c-bb21-4caa8279f319" [label="0: 0.463241"];
    "31303341-5a1f-4167-83ce-22e8ea1e462f" [label="0: 0.130626"];
    "1ee2e839-eb76-48b6-8149-a694de0bc740" [label="0: 0.085111"];
    "1294ba07-97b4-44da-b30f-6dc8de1f2506" [label="0: 0.053764"];
    "bb2ea4a5-e090-43bd-a89f-47459726a906" [label="4.09"];
    "cf4e8349-aae6-4f47-b2e3-bdfad76140b9" [label="-2.54"];
    "746064bd-117c-46d6-bbbd-8f3943d2b418" [label="6.51"];
    "3c7d308c-4094-44ed-a689-23971c62ea5a" [label="0: 0.377383"];
    "8cd38fe6-64f6-48f5-b3da-1755767c9c9e" [label="0: 0.3417"];
    "8fa7515c-8c27-400c-b2eb-fa95acd8f8c0" [label="0: 0.32889"];
    "c7820d96-a919-408a-a48d-6472c6b6cbe4" [label="0: 0.300318"];
    "046be329-b0aa-49f7-b3d6-7fc216d478c9" [label="0: 0.176523"];
    "0a30919d-85b8-4c93-bfbb-f1fcff96dba6" [label="0: 0.156273"];
    "4ad75fa1-e0b7-4678-be73-5de3775d397e" [label="-6.25"];
    "398d6223-1ef9-4c45-9ac7-44f61850dc45" [label="-12.11"];
    "3a51d811-12ca-46b5-8548-f40a6571b263" [label="0: 0.203993"];
    "2460b0c3-0648-4d24-8a57-e996df15a425" [label="3.45"];
    "a595f63d-5bc6-4901-8802-749a77568531" [label="0: 0.218321"];
    "a4cedd79-a7f9-4d67-acd2-bba51563dc30" [label="-11.82"];
    "9e2b62f5-877e-4c05-82e1-8264fe924106" [label="0: 0.228628"];
    "47a064c2-9c82-4e09-a069-80e6cd741aba" [label="6.77"];
    "78788a82-c815-4891-8933-d0d2d39e8ace" [label="0: 0.264639"];
    "83554092-3407-44a5-89d9-b1954126079b" [label="-13.07"];
    "26d04a07-eb77-4691-aca6-3bf94fae2063" [label="0.40"];
    "aabebca3-28f6-4bb4-aa90-9393dfa9479c" [label="-19.99"];
    "217e5a9d-99f6-48f6-b60d-4940de06b873" [label="15.06"];
    "322a229e-98a5-4f9d-b6ce-5be216bb9b66" [label="0: 0.351478"];
    "20c1cd26-5c8f-45ef-8557-34bdc7e7e936" [label="-22.69"];
    "d142ca3c-8c45-4ff5-a975-5614ec532660" [label="-15.09"];
    "dd4807c3-e5dd-4139-85a3-673bc6384037" [label="0: 0.446196"];
    "98190fc7-96ed-4c83-b200-1305f1663d12" [label="0: 0.418943"];
    "a2389757-85a4-4e78-82e5-3ddcb540cb5f" [label="0: 0.388789"];
    "3d1b079c-d685-4810-b7f4-c88824625a53" [label="3.66"];
    "0d29f899-b06a-4c70-aa65-9fbf6794eca4" [label="-0.89"];
    "25f74f36-c5be-4b83-a233-0c555c86ead9" [label="14.38"];
    "cadc3d94-2545-4a48-be70-6915c3fa2795" [label="-12.56"];
    "a57b9084-97cc-450d-bc7d-8feefef45a82" [label="0: 0.483803"];
    "8b548b72-2bc2-499c-bb77-12b597d0e793" [label="3.43"];
    "a8a356da-2c12-474c-a463-8110c1f6b8bf" [label="12.51"];
    "e9657d5a-ad84-4e6f-acfe-6cfaa5f88780" [label="0: 0.731636"];
    "39dce041-ce2f-44d7-bdbd-45cf38da4af8" [label="0: 0.642373"];
    "9aebff68-5a03-4060-899d-345ef1fddace" [label="0: 0.618868"];
    "5e1c0e83-b7d6-48b0-9ef5-560cbca09f44" [label="0: 0.585413"];
    "7daf39a6-a515-4506-a06e-23af1edbec6a" [label="0: 0.560301"];
    "097d7583-bea8-4b59-8cf6-bb1a7ec82295" [label="0: 0.531944"];
    "f4e8defb-d0a3-4328-9fb1-835d3f85e406" [label="101.74"];
    "b5524dcb-e110-4fca-95ba-eec513d60fdb" [label="0: 0.546601"];
    "c68f0abe-5d5a-4493-8cf7-6818f9601baa" [label="110.98"];
    "6bb0d9b4-4be8-468e-a888-19c08287fce3" [label="109.39"];
    "d51fb426-e5ec-416b-b844-e069df5f6dc8" [label="97.20"];
    "09721c80-fa8e-4cec-b962-32a79348d015" [label="123.21"];
    "841fee93-2890-490a-b2c8-8bb720c32ea6" [label="93.67"];
    "c53bc2c2-dfc2-4375-9f88-9736f075c294" [label="0: 0.667851"];
    "c55165f2-d3c8-4b70-8d8c-ace5415b172f" [label="114.15"];
    "a0ae2570-2f84-4f3f-aa45-689cdc3f64cf" [label="0: 0.70889"];
    "b640a2a1-5fd8-4063-869a-ff60c957dbc2" [label="0: 0.69892"];
    "1c5d7da8-d76c-444c-aa07-c8811130765d" [label="108.93"];
    "1fe9a864-a467-4aaf-b9f1-87bfc45c25b5" [label="104.82"];
    "c7358239-c55b-413d-aa76-540e5994f84f" [label="114.55"];
    "709c2602-24e3-4f9a-bb0d-8c089399c018" [label="0: 0.953902"];
    "80f583a9-c4eb-41af-bef3-cfb7d38d41ee" [label="0: 0.763328"];
    "9c72f602-2240-4ea7-9c0b-f7269fc86618" [label="78.09"];
    "6b400587-213d-483b-8ad0-110dc6d85507" [label="0: 0.798198"];
    "73c06578-7739-4122-98e0-63cb9dc39f81" [label="102.36"];
    "8ac0c1b7-2b2d-44b4-99f4-282aae0fe308" [label="0: 0.838587"];
    "633bb485-db8e-4d5d-8d68-1fc08619a673" [label="0: 0.815215"];
    "0b343b88-8a2a-49ae-a2d7-bd624044698b" [label="88.78"];
    "7da8e0bd-b3b8-4092-a34e-1ba8cd3ffbf7" [label="81.11"];
    "d3bb79be-7b85-4c0f-b5b7-dbb6bd4c54a2" [label="0: 0.948822"];
    "f3081916-79ff-4e39-8b30-fa054670c50f" [label="0: 0.856421"];
    "829fae81-4a91-46f9-8c6f-7559dd670841" [label="95.28"];
    "4d529db0-b857-4910-a482-9e61bc87f959" [label="0: 0.912161"];
    "818b003f-8abd-4ee8-a4ea-b596111a4f77" [label="0: 0.896683"];
    "c5f07126-a0a7-4d03-8ee2-b60dd344013f" [label="0: 0.883615"];
    "f9034542-ada8-48be-9db2-e4c9b2be70de" [label="102.25"];
    "3257b4c9-7a8a-4cb7-a620-0980005ecb6d" [label="95.18"];
    "6f8ddb81-d225-41e7-b1ec-fa6185f55097" [label="104.83"];
    "b5e03953-ad6c-4a39-981a-140d7fafc46a" [label="96.45"];
    "e85378cb-074a-4d35-ae82-a9a99f9daf50" [label="87.31"];
    "dfe3dd59-018c-41ab-8046-21ed7d136b4a" [label="0: 0.960398"];
    "2d37edbb-d4d3-4b55-a593-8a2b1ab3ff25" [label="112.43"];
    "f075116f-c182-4b0d-9f7e-2dae61982c18" [label="105.25"];
    "bdbe6f68-a446-4539-8a80-860f22663afe" -> "acef94b2-b18f-4c9c-bb21-4caa8279f319";
    "acef94b2-b18f-4c9c-bb21-4caa8279f319" -> "31303341-5a1f-4167-83ce-22e8ea1e462f";
    "31303341-5a1f-4167-83ce-22e8ea1e462f" -> "1ee2e839-eb76-48b6-8149-a694de0bc740";
    "1ee2e839-eb76-48b6-8149-a694de0bc740" -> "1294ba07-97b4-44da-b30f-6dc8de1f2506";
    "1294ba07-97b4-44da-b30f-6dc8de1f2506" -> "bb2ea4a5-e090-43bd-a89f-47459726a906";
    "1294ba07-97b4-44da-b30f-6dc8de1f2506" -> "cf4e8349-aae6-4f47-b2e3-bdfad76140b9";
    "1ee2e839-eb76-48b6-8149-a694de0bc740" -> "746064bd-117c-46d6-bbbd-8f3943d2b418";
    "31303341-5a1f-4167-83ce-22e8ea1e462f" -> "3c7d308c-4094-44ed-a689-23971c62ea5a";
    "3c7d308c-4094-44ed-a689-23971c62ea5a" -> "8cd38fe6-64f6-48f5-b3da-1755767c9c9e";
    "8cd38fe6-64f6-48f5-b3da-1755767c9c9e" -> "8fa7515c-8c27-400c-b2eb-fa95acd8f8c0";
    "8fa7515c-8c27-400c-b2eb-fa95acd8f8c0" -> "c7820d96-a919-408a-a48d-6472c6b6cbe4";
    "c7820d96-a919-408a-a48d-6472c6b6cbe4" -> "046be329-b0aa-49f7-b3d6-7fc216d478c9";
    "046be329-b0aa-49f7-b3d6-7fc216d478c9" -> "0a30919d-85b8-4c93-bfbb-f1fcff96dba6";
    "0a30919d-85b8-4c93-bfbb-f1fcff96dba6" -> "4ad75fa1-e0b7-4678-be73-5de3775d397e";
    "0a30919d-85b8-4c93-bfbb-f1fcff96dba6" -> "398d6223-1ef9-4c45-9ac7-44f61850dc45";
    "046be329-b0aa-49f7-b3d6-7fc216d478c9" -> "3a51d811-12ca-46b5-8548-f40a6571b263";
    "3a51d811-12ca-46b5-8548-f40a6571b263" -> "2460b0c3-0648-4d24-8a57-e996df15a425";
    "3a51d811-12ca-46b5-8548-f40a6571b263" -> "a595f63d-5bc6-4901-8802-749a77568531";
    "a595f63d-5bc6-4901-8802-749a77568531" -> "a4cedd79-a7f9-4d67-acd2-bba51563dc30";
    "a595f63d-5bc6-4901-8802-749a77568531" -> "9e2b62f5-877e-4c05-82e1-8264fe924106";
    "9e2b62f5-877e-4c05-82e1-8264fe924106" -> "47a064c2-9c82-4e09-a069-80e6cd741aba";
    "9e2b62f5-877e-4c05-82e1-8264fe924106" -> "78788a82-c815-4891-8933-d0d2d39e8ace";
    "78788a82-c815-4891-8933-d0d2d39e8ace" -> "83554092-3407-44a5-89d9-b1954126079b";
    "78788a82-c815-4891-8933-d0d2d39e8ace" -> "26d04a07-eb77-4691-aca6-3bf94fae2063";
    "c7820d96-a919-408a-a48d-6472c6b6cbe4" -> "aabebca3-28f6-4bb4-aa90-9393dfa9479c";
    "8fa7515c-8c27-400c-b2eb-fa95acd8f8c0" -> "217e5a9d-99f6-48f6-b60d-4940de06b873";
    "8cd38fe6-64f6-48f5-b3da-1755767c9c9e" -> "322a229e-98a5-4f9d-b6ce-5be216bb9b66";
    "322a229e-98a5-4f9d-b6ce-5be216bb9b66" -> "20c1cd26-5c8f-45ef-8557-34bdc7e7e936";
    "322a229e-98a5-4f9d-b6ce-5be216bb9b66" -> "d142ca3c-8c45-4ff5-a975-5614ec532660";
    "3c7d308c-4094-44ed-a689-23971c62ea5a" -> "dd4807c3-e5dd-4139-85a3-673bc6384037";
    "dd4807c3-e5dd-4139-85a3-673bc6384037" -> "98190fc7-96ed-4c83-b200-1305f1663d12";
    "98190fc7-96ed-4c83-b200-1305f1663d12" -> "a2389757-85a4-4e78-82e5-3ddcb540cb5f";
    "a2389757-85a4-4e78-82e5-3ddcb540cb5f" -> "3d1b079c-d685-4810-b7f4-c88824625a53";
    "a2389757-85a4-4e78-82e5-3ddcb540cb5f" -> "0d29f899-b06a-4c70-aa65-9fbf6794eca4";
    "98190fc7-96ed-4c83-b200-1305f1663d12" -> "25f74f36-c5be-4b83-a233-0c555c86ead9";
    "dd4807c3-e5dd-4139-85a3-673bc6384037" -> "cadc3d94-2545-4a48-be70-6915c3fa2795";
    "acef94b2-b18f-4c9c-bb21-4caa8279f319" -> "a57b9084-97cc-450d-bc7d-8feefef45a82";
    "a57b9084-97cc-450d-bc7d-8feefef45a82" -> "8b548b72-2bc2-499c-bb77-12b597d0e793";
    "a57b9084-97cc-450d-bc7d-8feefef45a82" -> "a8a356da-2c12-474c-a463-8110c1f6b8bf";
    "bdbe6f68-a446-4539-8a80-860f22663afe" -> "e9657d5a-ad84-4e6f-acfe-6cfaa5f88780";
    "e9657d5a-ad84-4e6f-acfe-6cfaa5f88780" -> "39dce041-ce2f-44d7-bdbd-45cf38da4af8";
    "39dce041-ce2f-44d7-bdbd-45cf38da4af8" -> "9aebff68-5a03-4060-899d-345ef1fddace";
    "9aebff68-5a03-4060-899d-345ef1fddace" -> "5e1c0e83-b7d6-48b0-9ef5-560cbca09f44";
    "5e1c0e83-b7d6-48b0-9ef5-560cbca09f44" -> "7daf39a6-a515-4506-a06e-23af1edbec6a";
    "7daf39a6-a515-4506-a06e-23af1edbec6a" -> "097d7583-bea8-4b59-8cf6-bb1a7ec82295";
    "097d7583-bea8-4b59-8cf6-bb1a7ec82295" -> "f4e8defb-d0a3-4328-9fb1-835d3f85e406";
    "097d7583-bea8-4b59-8cf6-bb1a7ec82295" -> "b5524dcb-e110-4fca-95ba-eec513d60fdb";
    "b5524dcb-e110-4fca-95ba-eec513d60fdb" -> "c68f0abe-5d5a-4493-8cf7-6818f9601baa";
    "b5524dcb-e110-4fca-95ba-eec513d60fdb" -> "6bb0d9b4-4be8-468e-a888-19c08287fce3";
    "7daf39a6-a515-4506-a06e-23af1edbec6a" -> "d51fb426-e5ec-416b-b844-e069df5f6dc8";
    "5e1c0e83-b7d6-48b0-9ef5-560cbca09f44" -> "09721c80-fa8e-4cec-b962-32a79348d015";
    "9aebff68-5a03-4060-899d-345ef1fddace" -> "841fee93-2890-490a-b2c8-8bb720c32ea6";
    "39dce041-ce2f-44d7-bdbd-45cf38da4af8" -> "c53bc2c2-dfc2-4375-9f88-9736f075c294";
    "c53bc2c2-dfc2-4375-9f88-9736f075c294" -> "c55165f2-d3c8-4b70-8d8c-ace5415b172f";
    "c53bc2c2-dfc2-4375-9f88-9736f075c294" -> "a0ae2570-2f84-4f3f-aa45-689cdc3f64cf";
    "a0ae2570-2f84-4f3f-aa45-689cdc3f64cf" -> "b640a2a1-5fd8-4063-869a-ff60c957dbc2";
    "b640a2a1-5fd8-4063-869a-ff60c957dbc2" -> "1c5d7da8-d76c-444c-aa07-c8811130765d";
    "b640a2a1-5fd8-4063-869a-ff60c957dbc2" -> "1fe9a864-a467-4aaf-b9f1-87bfc45c25b5";
    "a0ae2570-2f84-4f3f-aa45-689cdc3f64cf" -> "c7358239-c55b-413d-aa76-540e5994f84f";
    "e9657d5a-ad84-4e6f-acfe-6cfaa5f88780" -> "709c2602-24e3-4f9a-bb0d-8c089399c018";
    "709c2602-24e3-4f9a-bb0d-8c089399c018" -> "80f583a9-c4eb-41af-bef3-cfb7d38d41ee";
    "80f583a9-c4eb-41af-bef3-cfb7d38d41ee" -> "9c72f602-2240-4ea7-9c0b-f7269fc86618";
    "80f583a9-c4eb-41af-bef3-cfb7d38d41ee" -> "6b400587-213d-483b-8ad0-110dc6d85507";
    "6b400587-213d-483b-8ad0-110dc6d85507" -> "73c06578-7739-4122-98e0-63cb9dc39f81";
    "6b400587-213d-483b-8ad0-110dc6d85507" -> "8ac0c1b7-2b2d-44b4-99f4-282aae0fe308";
    "8ac0c1b7-2b2d-44b4-99f4-282aae0fe308" -> "633bb485-db8e-4d5d-8d68-1fc08619a673";
    "633bb485-db8e-4d5d-8d68-1fc08619a673" -> "0b343b88-8a2a-49ae-a2d7-bd624044698b";
    "633bb485-db8e-4d5d-8d68-1fc08619a673" -> "7da8e0bd-b3b8-4092-a34e-1ba8cd3ffbf7";
    "8ac0c1b7-2b2d-44b4-99f4-282aae0fe308" -> "d3bb79be-7b85-4c0f-b5b7-dbb6bd4c54a2";
    "d3bb79be-7b85-4c0f-b5b7-dbb6bd4c54a2" -> "f3081916-79ff-4e39-8b30-fa054670c50f";
    "f3081916-79ff-4e39-8b30-fa054670c50f" -> "829fae81-4a91-46f9-8c6f-7559dd670841";
    "f3081916-79ff-4e39-8b30-fa054670c50f" -> "4d529db0-b857-4910-a482-9e61bc87f959";
    "4d529db0-b857-4910-a482-9e61bc87f959" -> "818b003f-8abd-4ee8-a4ea-b596111a4f77";
    "818b003f-8abd-4ee8-a4ea-b596111a4f77" -> "c5f07126-a0a7-4d03-8ee2-b60dd344013f";
    "c5f07126-a0a7-4d03-8ee2-b60dd344013f" -> "f9034542-ada8-48be-9db2-e4c9b2be70de";
    "c5f07126-a0a7-4d03-8ee2-b60dd344013f" -> "3257b4c9-7a8a-4cb7-a620-0980005ecb6d";
    "818b003f-8abd-4ee8-a4ea-b596111a4f77" -> "6f8ddb81-d225-41e7-b1ec-fa6185f55097";
    "4d529db0-b857-4910-a482-9e61bc87f959" -> "b5e03953-ad6c-4a39-981a-140d7fafc46a";
    "d3bb79be-7b85-4c0f-b5b7-dbb6bd4c54a2" -> "e85378cb-074a-4d35-ae82-a9a99f9daf50";
    "709c2602-24e3-4f9a-bb0d-8c089399c018" -> "dfe3dd59-018c-41ab-8046-21ed7d136b4a";
    "dfe3dd59-018c-41ab-8046-21ed7d136b4a" -> "2d37edbb-d4d3-4b55-a593-8a2b1ab3ff25";
    "dfe3dd59-018c-41ab-8046-21ed7d136b4a" -> "f075116f-c182-4b0d-9f7e-2dae61982c18";
}

================================================
FILE: classification_and_regression_trees/dot/ex2_prune.dot
================================================
digraph decision_tree {
    "c4bff19d-b75d-4b50-99e8-34f696a77644" [label="0: 0.508542"];
    "68b83894-3568-462c-a8c2-a2cfa600d44c" [label="0: 0.463241"];
    "8fb7d681-5bb9-487b-804f-592a8760babb" [label="0: 0.130626"];
    "ffb50925-bc5a-405b-aae5-15ba23da800d" [label="0: 0.085111"];
    "92c528b2-54f3-488d-a817-b99003f2be3a" [label="0.77"];
    "8f89c68f-ec97-4e89-9757-6d4d349dfb0b" [label="6.51"];
    "ca28f207-af7e-43c9-8e5d-36a5e9cd3be5" [label="0: 0.377383"];
    "14779c7f-cb2a-4452-b325-a27aef8b1af8" [label="0: 0.3417"];
    "560ae6ed-a17a-46b8-a400-9e78920cf41b" [label="0: 0.32889"];
    "288dea95-80b6-4638-8fb1-a1b657a19a73" [label="0: 0.300318"];
    "9ac61729-5c19-4e91-915b-20beb665ceaa" [label="0: 0.176523"];
    "42f1a782-408b-4c15-8bf0-13a01f968f81" [label="-9.18"];
    "5f466b9c-fe6c-4465-97af-2520e6b66286" [label="0: 0.203993"];
    "0a18504c-63a2-4af1-868c-38c591a02a60" [label="3.45"];
    "74c75b17-a2a7-4bf0-b7d0-c4e4e916a29e" [label="0: 0.218321"];
    "d1373766-52cd-4196-babe-eff0093695fe" [label="-11.82"];
    "cfdc11bb-2ae8-4807-b3f1-c8e27c6f2317" [label="0: 0.228628"];
    "480b7cfa-89e7-434d-9980-09192f674081" [label="6.77"];
    "d56fe489-c3a4-4bcd-ace9-56c0e2510ec4" [label="0: 0.264639"];
    "675438ce-b6e6-4489-8756-653ee5e37021" [label="-13.07"];
    "2a8f2806-4eac-45ea-a820-eb776a0d8689" [label="0.40"];
    "7993ad91-42aa-4b70-91db-5fbfed178b89" [label="-19.99"];
    "f55749c7-c885-4d1a-b741-c3bd5c09f2dd" [label="15.06"];
    "4f8fcefe-a83c-4aae-bff7-8ff98dec1643" [label="0: 0.351478"];
    "e87b5ff0-822f-48c0-ae82-d32f92dd2111" [label="-22.69"];
    "37a37975-c5da-4c85-9c02-d597c9041252" [label="-15.09"];
    "0521b42e-1682-418f-a005-fe3d5098505a" [label="0: 0.446196"];
    "c7762b6e-66e6-4a15-8458-f9ff2cc5cac6" [label="0: 0.418943"];
    "266b231b-6083-474d-9ba8-8dbc1e4fec60" [label="1.38"];
    "5981dfef-12f2-4ecf-83ef-fb111c6db412" [label="14.38"];
    "fa23660c-72b3-49e7-a1f6-11038e6d0c2b" [label="-12.56"];
    "c5ec6450-a9d7-4173-bca6-bb19e0eb3816" [label="0: 0.483803"];
    "f31655f3-1003-4d88-8fcb-d57d75cf914e" [label="3.43"];
    "f29cc8c5-ed1d-4b57-b16c-c657dd787edc" [label="12.51"];
    "e74f8324-62e8-4d39-bd77-75ff0f2998f4" [label="0: 0.731636"];
    "22f47068-79d7-4f15-8375-fca2b236b75f" [label="0: 0.642373"];
    "d706789e-3e3b-4a19-8c5d-f97c17092013" [label="0: 0.618868"];
    "72712f51-6089-4f84-bfac-5211971a9785" [label="0: 0.585413"];
    "ac1505e9-2492-4269-a36d-bb93042e0af7" [label="0: 0.560301"];
    "fc6c2bb4-7b7b-4d3f-b979-8ca27d29be5b" [label="0: 0.531944"];
    "74d4917b-5afd-4078-bf91-22911bf0286f" [label="101.74"];
    "eb5af9d7-1dc0-4e12-9b6f-d2de3d1dd6a7" [label="110.18"];
    "d84a0c88-9b2c-4932-9f0f-c4c52d82de76" [label="97.20"];
    "31a5cd3b-dbf9-4943-b90e-9a6d3a0a2a21" [label="123.21"];
    "0790ed24-11c3-4310-a28c-fdd2d20ebed1" [label="93.67"];
    "993e9dfe-e330-48c9-a4d5-44b6124403f0" [label="0: 0.667851"];
    "8f434490-e990-4724-98b8-53c4b2cdbed5" [label="114.15"];
    "4dc134f3-4882-45c0-bdac-eddc21bc10df" [label="0: 0.70889"];
    "48da8c69-8998-4d8d-bd99-b268ed3281b9" [label="106.88"];
    "393f82d6-a9a0-40e7-9c48-b3dc22233ef6" [label="114.55"];
    "87ee99b7-59a7-49af-816d-ac7a0dc7d0cb" [label="0: 0.953902"];
    "c5768cbc-7bae-45f3-80a0-ed03b930766b" [label="0: 0.763328"];
    "633c832d-b589-4db2-a15e-0369eee79d08" [label="78.09"];
    "e81e22e5-514c-42f7-859f-d21419008366" [label="0: 0.798198"];
    "16ac6188-471c-439d-8503-da094a54ec82" [label="102.36"];
    "0ed901ab-9dfb-4e60-b878-62b692e7dddc" [label="0: 0.838587"];
    "e6c982d1-2c92-4787-9af6-d0f25432b035" [label="84.95"];
    "7fa3f95d-e8a4-452d-bd78-dae6ad591cca" [label="0: 0.948822"];
    "37d790b6-4448-4f11-b6c4-fa117cba9992" [label="0: 0.856421"];
    "e068d29d-a65a-4c65-a5e6-3321f93076ba" [label="95.28"];
    "7031a55c-261c-486b-ad1c-c94a66412706" [label="0: 0.912161"];
    "5a91e296-ef36-4790-9e42-d191c8f5830a" [label="0: 0.896683"];
    "c63246a4-4508-41f8-b7fe-412ac81e918d" [label="98.72"];
    "0615432d-ca65-458e-971a-bfd6201edbaa" [label="104.83"];
    "64df91ab-df9b-4c98-9e0d-4f8b2d2a4794" [label="96.45"];
    "b1351d64-cf46-4654-9446-1bf4b89c0c32" [label="87.31"];
    "e0dd7e12-6468-41b5-b63b-1c9a04ab682b" [label="108.84"];
    "c4bff19d-b75d-4b50-99e8-34f696a77644" -> "68b83894-3568-462c-a8c2-a2cfa600d44c";
    "68b83894-3568-462c-a8c2-a2cfa600d44c" -> "8fb7d681-5bb9-487b-804f-592a8760babb";
    "8fb7d681-5bb9-487b-804f-592a8760babb" -> "ffb50925-bc5a-405b-aae5-15ba23da800d";
    "ffb50925-bc5a-405b-aae5-15ba23da800d" -> "92c528b2-54f3-488d-a817-b99003f2be3a";
    "ffb50925-bc5a-405b-aae5-15ba23da800d" -> "8f89c68f-ec97-4e89-9757-6d4d349dfb0b";
    "8fb7d681-5bb9-487b-804f-592a8760babb" -> "ca28f207-af7e-43c9-8e5d-36a5e9cd3be5";
    "ca28f207-af7e-43c9-8e5d-36a5e9cd3be5" -> "14779c7f-cb2a-4452-b325-a27aef8b1af8";
    "14779c7f-cb2a-4452-b325-a27aef8b1af8" -> "560ae6ed-a17a-46b8-a400-9e78920cf41b";
    "560ae6ed-a17a-46b8-a400-9e78920cf41b" -> "288dea95-80b6-4638-8fb1-a1b657a19a73";
    "288dea95-80b6-4638-8fb1-a1b657a19a73" -> "9ac61729-5c19-4e91-915b-20beb665ceaa";
    "9ac61729-5c19-4e91-915b-20beb665ceaa" -> "42f1a782-408b-4c15-8bf0-13a01f968f81";
    "9ac61729-5c19-4e91-915b-20beb665ceaa" -> "5f466b9c-fe6c-4465-97af-2520e6b66286";
    "5f466b9c-fe6c-4465-97af-2520e6b66286" -> "0a18504c-63a2-4af1-868c-38c591a02a60";
    "5f466b9c-fe6c-4465-97af-2520e6b66286" -> "74c75b17-a2a7-4bf0-b7d0-c4e4e916a29e";
    "74c75b17-a2a7-4bf0-b7d0-c4e4e916a29e" -> "d1373766-52cd-4196-babe-eff0093695fe";
    "74c75b17-a2a7-4bf0-b7d0-c4e4e916a29e" -> "cfdc11bb-2ae8-4807-b3f1-c8e27c6f2317";
    "cfdc11bb-2ae8-4807-b3f1-c8e27c6f2317" -> "480b7cfa-89e7-434d-9980-09192f674081";
    "cfdc11bb-2ae8-4807-b3f1-c8e27c6f2317" -> "d56fe489-c3a4-4bcd-ace9-56c0e2510ec4";
    "d56fe489-c3a4-4bcd-ace9-56c0e2510ec4" -> "675438ce-b6e6-4489-8756-653ee5e37021";
    "d56fe489-c3a4-4bcd-ace9-56c0e2510ec4" -> "2a8f2806-4eac-45ea-a820-eb776a0d8689";
    "288dea95-80b6-4638-8fb1-a1b657a19a73" -> "7993ad91-42aa-4b70-91db-5fbfed178b89";
    "560ae6ed-a17a-46b8-a400-9e78920cf41b" -> "f55749c7-c885-4d1a-b741-c3bd5c09f2dd";
    "14779c7f-cb2a-4452-b325-a27aef8b1af8" -> "4f8fcefe-a83c-4aae-bff7-8ff98dec1643";
    "4f8fcefe-a83c-4aae-bff7-8ff98dec1643" -> "e87b5ff0-822f-48c0-ae82-d32f92dd2111";
    "4f8fcefe-a83c-4aae-bff7-8ff98dec1643" -> "37a37975-c5da-4c85-9c02-d597c9041252";
    "ca28f207-af7e-43c9-8e5d-36a5e9cd3be5" -> "0521b42e-1682-418f-a005-fe3d5098505a";
    "0521b42e-1682-418f-a005-fe3d5098505a" -> "c7762b6e-66e6-4a15-8458-f9ff2cc5cac6";
    "c7762b6e-66e6-4a15-8458-f9ff2cc5cac6" -> "266b231b-6083-474d-9ba8-8dbc1e4fec60";
    "c7762b6e-66e6-4a15-8458-f9ff2cc5cac6" -> "5981dfef-12f2-4ecf-83ef-fb111c6db412";
    "0521b42e-1682-418f-a005-fe3d5098505a" -> "fa23660c-72b3-49e7-a1f6-11038e6d0c2b";
    "68b83894-3568-462c-a8c2-a2cfa600d44c" -> "c5ec6450-a9d7-4173-bca6-bb19e0eb3816";
    "c5ec6450-a9d7-4173-bca6-bb19e0eb3816" -> "f31655f3-1003-4d88-8fcb-d57d75cf914e";
    "c5ec6450-a9d7-4173-bca6-bb19e0eb3816" -> "f29cc8c5-ed1d-4b57-b16c-c657dd787edc";
    "c4bff19d-b75d-4b50-99e8-34f696a77644" -> "e74f8324-62e8-4d39-bd77-75ff0f2998f4";
    "e74f8324-62e8-4d39-bd77-75ff0f2998f4" -> "22f47068-79d7-4f15-8375-fca2b236b75f";
    "22f47068-79d7-4f15-8375-fca2b236b75f" -> "d706789e-3e3b-4a19-8c5d-f97c17092013";
    "d706789e-3e3b-4a19-8c5d-f97c17092013" -> "72712f51-6089-4f84-bfac-5211971a9785";
    "72712f51-6089-4f84-bfac-5211971a9785" -> "ac1505e9-2492-4269-a36d-bb93042e0af7";
    "ac1505e9-2492-4269-a36d-bb93042e0af7" -> "fc6c2bb4-7b7b-4d3f-b979-8ca27d29be5b";
    "fc6c2bb4-7b7b-4d3f-b979-8ca27d29be5b" -> "74d4917b-5afd-4078-bf91-22911bf0286f";
    "fc6c2bb4-7b7b-4d3f-b979-8ca27d29be5b" -> "eb5af9d7-1dc0-4e12-9b6f-d2de3d1dd6a7";
    "ac1505e9-2492-4269-a36d-bb93042e0af7" -> "d84a0c88-9b2c-4932-9f0f-c4c52d82de76";
    "72712f51-6089-4f84-bfac-5211971a9785" -> "31a5cd3b-dbf9-4943-b90e-9a6d3a0a2a21";
    "d706789e-3e3b-4a19-8c5d-f97c17092013" -> "0790ed24-11c3-4310-a28c-fdd2d20ebed1";
    "22f47068-79d7-4f15-8375-fca2b236b75f" -> "993e9dfe-e330-48c9-a4d5-44b6124403f0";
    "993e9dfe-e330-48c9-a4d5-44b6124403f0" -> "8f434490-e990-4724-98b8-53c4b2cdbed5";
    "993e9dfe-e330-48c9-a4d5-44b6124403f0" -> "4dc134f3-4882-45c0-bdac-eddc21bc10df";
    "4dc134f3-4882-45c0-bdac-eddc21bc10df" -> "48da8c69-8998-4d8d-bd99-b268ed3281b9";
    "4dc134f3-4882-45c0-bdac-eddc21bc10df" -> "393f82d6-a9a0-40e7-9c48-b3dc22233ef6";
    "e74f8324-62e8-4d39-bd77-75ff0f2998f4" -> "87ee99b7-59a7-49af-816d-ac7a0dc7d0cb";
    "87ee99b7-59a7-49af-816d-ac7a0dc7d0cb" -> "c5768cbc-7bae-45f3-80a0-ed03b930766b";
    "c5768cbc-7bae-45f3-80a0-ed03b930766b" -> "633c832d-b589-4db2-a15e-0369eee79d08";
    "c5768cbc-7bae-45f3-80a0-ed03b930766b" -> "e81e22e5-514c-42f7-859f-d21419008366";
    "e81e22e5-514c-42f7-859f-d21419008366" -> "16ac6188-471c-439d-8503-da094a54ec82";
    "e81e22e5-514c-42f7-859f-d21419008366" -> "0ed901ab-9dfb-4e60-b878-62b692e7dddc";
    "0ed901ab-9dfb-4e60-b878-62b692e7dddc" -> "e6c982d1-2c92-4787-9af6-d0f25432b035";
    "0ed901ab-9dfb-4e60-b878-62b692e7dddc" -> "7fa3f95d-e8a4-452d-bd78-dae6ad591cca";
    "7fa3f95d-e8a4-452d-bd78-dae6ad591cca" -> "37d790b6-4448-4f11-b6c4-fa117cba9992";
    "37d790b6-4448-4f11-b6c4-fa117cba9992" -> "e068d29d-a65a-4c65-a5e6-3321f93076ba";
    "37d790b6-4448-4f11-b6c4-fa117cba9992" -> "7031a55c-261c-486b-ad1c-c94a66412706";
    "7031a55c-261c-486b-ad1c-c94a66412706" -> "5a91e296-ef36-4790-9e42-d191c8f5830a";
    "5a91e296-ef36-4790-9e42-d191c8f5830a" -> "c63246a4-4508-41f8-b7fe-412ac81e918d";
    "5a91e296-ef36-4790-9e42-d191c8f5830a" -> "0615432d-ca65-458e-971a-bfd6201edbaa";
    "7031a55c-261c-486b-ad1c-c94a66412706" -> "64df91ab-df9b-4c98-9e0d-4f8b2d2a4794";
    "7fa3f95d-e8a4-452d-bd78-dae6ad591cca" -> "b1351d64-cf46-4654-9446-1bf4b89c0c32";
    "87ee99b7-59a7-49af-816d-ac7a0dc7d0cb" -> "e0dd7e12-6468-41b5-b63b-1c9a04ab682b";
}

================================================
FILE: classification_and_regression_trees/dot/exp2.dot
================================================
digraph decision_tree {
    "5c49cf77-b404-459e-b4fd-513a927807dc" [label="0: 0.304401"];
    "83d1a5dd-ca47-4f50-845b-387b99fa210e" [label="[3.4687793552577886, 1.1852174309187824]"];
    "dff8f3c5-1acc-4500-993a-7ab19e72d907" [label="[0.0016985569361161585, 11.964773944276974]"];
    "5c49cf77-b404-459e-b4fd-513a927807dc" -> "83d1a5dd-ca47-4f50-845b-387b99fa210e";
    "5c49cf77-b404-459e-b4fd-513a927807dc" -> "dff8f3c5-1acc-4500-993a-7ab19e72d907";
}

================================================
FILE: classification_and_regression_trees/ex0.txt
================================================
0.409175	1.883180
0.182603	0.063908
0.663687	3.042257
0.517395	2.305004
0.013643	-0.067698
0.469643	1.662809
0.725426	3.275749
0.394350	1.118077
0.507760	2.095059
0.237395	1.181912
0.057534	0.221663
0.369820	0.938453
0.976819	4.149409
0.616051	3.105444
0.413700	1.896278
0.105279	-0.121345
0.670273	3.161652
0.952758	4.135358
0.272316	0.859063
0.303697	1.170272
0.486698	1.687960
0.511810	1.979745
0.195865	0.068690
0.986769	4.052137
0.785623	3.156316
0.797583	2.950630
0.081306	0.068935
0.659753	2.854020
0.375270	0.999743
0.819136	4.048082
0.142432	0.230923
0.215112	0.816693
0.041270	0.130713
0.044136	-0.537706
0.131337	-0.339109
0.463444	2.124538
0.671905	2.708292
0.946559	4.017390
0.904176	4.004021
0.306674	1.022555
0.819006	3.657442
0.845472	4.073619
0.156258	0.011994
0.857185	3.640429
0.400158	1.808497
0.375395	1.431404
0.885807	3.935544
0.239960	1.162152
0.148640	-0.227330
0.143143	-0.068728
0.321582	0.825051
0.509393	2.008645
0.355891	0.664566
0.938633	4.180202
0.348057	0.864845
0.438898	1.851174
0.781419	2.761993
0.911333	4.075914
0.032469	0.110229
0.499985	2.181987
0.771663	3.152528
0.670361	3.046564
0.176202	0.128954
0.392170	1.062726
0.911188	3.651742
0.872288	4.401950
0.733107	3.022888
0.610239	2.874917
0.732739	2.946801
0.714825	2.893644
0.076386	0.072131
0.559009	1.748275
0.427258	1.912047
0.841875	3.710686
0.558918	1.719148
0.533241	2.174090
0.956665	3.656357
0.620393	3.522504
0.566120	2.234126
0.523258	1.859772
0.476884	2.097017
0.176408	0.001794
0.303094	1.231928
0.609731	2.953862
0.017774	-0.116803
0.622616	2.638864
0.886539	3.943428
0.148654	-0.328513
0.104350	-0.099866
0.116868	-0.030836
0.516514	2.359786
0.664896	3.212581
0.004327	0.188975
0.425559	1.904109
0.743671	3.007114
0.935185	3.845834
0.697300	3.079411
0.444551	1.939739
0.683753	2.880078
0.755993	3.063577
0.902690	4.116296
0.094491	-0.240963
0.873831	4.066299
0.991810	4.011834
0.185611	0.077710
0.694551	3.103069
0.657275	2.811897
0.118746	-0.104630
0.084302	0.025216
0.945341	4.330063
0.785827	3.087091
0.530933	2.269988
0.879594	4.010701
0.652770	3.119542
0.879338	3.723411
0.764739	2.792078
0.504884	2.192787
0.554203	2.081305
0.493209	1.714463
0.363783	0.885854
0.316465	1.028187
0.580283	1.951497
0.542898	1.709427
0.112661	0.144068
0.816742	3.880240
0.234175	0.921876
0.402804	1.979316
0.709423	3.085768
0.867298	3.476122
0.993392	3.993679
0.711580	3.077880
0.133643	-0.105365
0.052031	-0.164703
0.366806	1.096814
0.697521	3.092879
0.787262	2.987926
0.476710	2.061264
0.721417	2.746854
0.230376	0.716710
0.104397	0.103831
0.197834	0.023776
0.129291	-0.033299
0.528528	1.942286
0.009493	-0.006338
0.998533	3.808753
0.363522	0.652799
0.901386	4.053747
0.832693	4.569290
0.119002	-0.032773
0.487638	2.066236
0.153667	0.222785
0.238619	1.089268
0.208197	1.487788
0.750921	2.852033
0.183403	0.024486
0.995608	3.737750
0.151311	0.045017
0.126804	0.001238
0.983153	3.892763
0.772495	2.819376
0.784133	2.830665
0.056934	0.234633
0.425584	1.810782
0.998709	4.237235
0.707815	3.034768
0.413816	1.742106
0.217152	1.169250
0.360503	0.831165
0.977989	3.729376
0.507953	1.823205
0.920771	4.021970
0.210542	1.262939
0.928611	4.159518
0.580373	2.039114
0.841390	4.101837
0.681530	2.778672
0.292795	1.228284
0.456918	1.736620
0.134128	-0.195046
0.016241	-0.063215
0.691214	3.305268
0.582002	2.063627
0.303102	0.898840
0.622598	2.701692
0.525024	1.992909
0.996775	3.811393
0.881025	4.353857
0.723457	2.635641
0.676346	2.856311
0.254625	1.352682
0.488632	2.336459
0.519875	2.111651
0.160176	0.121726
0.609483	3.264605
0.531881	2.103446
0.321632	0.896855
0.845148	4.220850
0.012003	-0.217283
0.018883	-0.300577
0.071476	0.006014


================================================
FILE: classification_and_regression_trees/ex00.txt
================================================
0.036098	0.155096
0.993349	1.077553
0.530897	0.893462
0.712386	0.564858
0.343554	-0.371700
0.098016	-0.332760
0.691115	0.834391
0.091358	0.099935
0.727098	1.000567
0.951949	0.945255
0.768596	0.760219
0.541314	0.893748
0.146366	0.034283
0.673195	0.915077
0.183510	0.184843
0.339563	0.206783
0.517921	1.493586
0.703755	1.101678
0.008307	0.069976
0.243909	-0.029467
0.306964	-0.177321
0.036492	0.408155
0.295511	0.002882
0.837522	1.229373
0.202054	-0.087744
0.919384	1.029889
0.377201	-0.243550
0.814825	1.095206
0.611270	0.982036
0.072243	-0.420983
0.410230	0.331722
0.869077	1.114825
0.620599	1.334421
0.101149	0.068834
0.820802	1.325907
0.520044	0.961983
0.488130	-0.097791
0.819823	0.835264
0.975022	0.673579
0.953112	1.064690
0.475976	-0.163707
0.273147	-0.455219
0.804586	0.924033
0.074795	-0.349692
0.625336	0.623696
0.656218	0.958506
0.834078	1.010580
0.781930	1.074488
0.009849	0.056594
0.302217	-0.148650
0.678287	0.907727
0.180506	0.103676
0.193641	-0.327589
0.343479	0.175264
0.145809	0.136979
0.996757	1.035533
0.590210	1.336661
0.238070	-0.358459
0.561362	1.070529
0.377597	0.088505
0.099142	0.025280
0.539558	1.053846
0.790240	0.533214
0.242204	0.209359
0.152324	0.132858
0.252649	-0.055613
0.895930	1.077275
0.133300	-0.223143
0.559763	1.253151
0.643665	1.024241
0.877241	0.797005
0.613765	1.621091
0.645762	1.026886
0.651376	1.315384
0.697718	1.212434
0.742527	1.087056
0.901056	1.055900
0.362314	-0.556464
0.948268	0.631862
0.000234	0.060903
0.750078	0.906291
0.325412	-0.219245
0.726828	1.017112
0.348013	0.048939
0.458121	-0.061456
0.280738	-0.228880
0.567704	0.969058
0.750918	0.748104
0.575805	0.899090
0.507940	1.107265
0.071769	-0.110946
0.553520	1.391273
0.401152	-0.121640
0.406649	-0.366317
0.652121	1.004346
0.347837	-0.153405
0.081931	-0.269756
0.821648	1.280895
0.048014	0.064496
0.130962	0.184241
0.773422	1.125943
0.789625	0.552614
0.096994	0.227167
0.625791	1.244731
0.589575	1.185812
0.323181	0.180811
0.822443	1.086648
0.360323	-0.204830
0.950153	1.022906
0.527505	0.879560
0.860049	0.717490
0.007044	0.094150
0.438367	0.034014
0.574573	1.066130
0.536689	0.867284
0.782167	0.886049
0.989888	0.744207
0.761474	1.058262
0.985425	1.227946
0.132543	-0.329372
0.346986	-0.150389
0.768784	0.899705
0.848921	1.170959
0.449280	0.069098
0.066172	0.052439
0.813719	0.706601
0.661923	0.767040
0.529491	1.022206
0.846455	0.720030
0.448656	0.026974
0.795072	0.965721
0.118156	-0.077409
0.084248	-0.019547
0.845815	0.952617
0.576946	1.234129
0.772083	1.299018
0.696648	0.845423
0.595012	1.213435
0.648675	1.287407
0.897094	1.240209
0.552990	1.036158
0.332982	0.210084
0.065615	-0.306970
0.278661	0.253628
0.773168	1.140917
0.203693	-0.064036
0.355688	-0.119399
0.988852	1.069062
0.518735	1.037179
0.514563	1.156648
0.976414	0.862911
0.919074	1.123413
0.697777	0.827805
0.928097	0.883225
0.900272	0.996871
0.344102	-0.061539
0.148049	0.204298
0.130052	-0.026167
0.302001	0.317135
0.337100	0.026332
0.314924	-0.001952
0.269681	-0.165971
0.196005	-0.048847
0.129061	0.305107
0.936783	1.026258
0.305540	-0.115991
0.683921	1.414382
0.622398	0.766330
0.902532	0.861601
0.712503	0.933490
0.590062	0.705531
0.723120	1.307248
0.188218	0.113685
0.643601	0.782552
0.520207	1.209557
0.233115	-0.348147
0.465625	-0.152940
0.884512	1.117833
0.663200	0.701634
0.268857	0.073447
0.729234	0.931956
0.429664	-0.188659
0.737189	1.200781
0.378595	-0.296094
0.930173	1.035645
0.774301	0.836763
0.273940	-0.085713
0.824442	1.082153
0.626011	0.840544
0.679390	1.307217
0.578252	0.921885
0.785541	1.165296
0.597409	0.974770
0.014083	-0.132525
0.663870	1.187129
0.552381	1.369630
0.683886	0.999985
0.210334	-0.006899
0.604529	1.212685
0.250744	0.046297


================================================
FILE: classification_and_regression_trees/ex2.dot
================================================
digraph decision_tree {
    "e1b05249-eb8e-4afd-837c-d2f5a5299a6a" [label="0: 0.508542"];
    "b82d5e44-41de-40ec-8558-fad039b53058" [label="-2.64"];
    "0b668e3e-42eb-4735-a6ba-420826ffc809" [label="0: 0.731636"];
    "e1a950cd-cd46-4ce1-941e-c59c56377d2e" [label="107.69"];
    "b2ee8f32-0401-4b83-a2ee-3f9212b6d8a1" [label="96.32"];
    "e1b05249-eb8e-4afd-837c-d2f5a5299a6a" -> "b82d5e44-41de-40ec-8558-fad039b53058";
    "e1b05249-eb8e-4afd-837c-d2f5a5299a6a" -> "0b668e3e-42eb-4735-a6ba-420826ffc809";
    "0b668e3e-42eb-4735-a6ba-420826ffc809" -> "e1a950cd-cd46-4ce1-941e-c59c56377d2e";
    "0b668e3e-42eb-4735-a6ba-420826ffc809" -> "b2ee8f32-0401-4b83-a2ee-3f9212b6d8a1";
}

================================================
FILE: classification_and_regression_trees/ex2.txt
================================================
0.228628	-2.266273
0.965969	112.386764
0.342761	-31.584855
0.901444	87.300625
0.585413	125.295113
0.334900	18.976650
0.769043	64.041941
0.297107	-1.798377
0.901421	100.133819
0.176523	0.946348
0.710234	108.553919
0.981980	86.399637
0.085873	-10.137104
0.537834	90.995536
0.806158	62.877698
0.708890	135.416767
0.787755	118.642009
0.463241	17.171057
0.300318	-18.051318
0.815215	118.319942
0.139880	7.336784
0.068373	-15.160836
0.457563	-34.044555
0.665652	105.547997
0.084661	-24.132226
0.954711	100.935789
0.953902	130.926480
0.487381	27.729263
0.759504	81.106762
0.454312	-20.360067
0.295993	-14.988279
0.156067	7.557349
0.428582	15.224266
0.847219	76.240984
0.499171	11.924204
0.203993	-22.379119
0.548539	83.114502
0.790312	110.159730
0.937766	119.949824
0.218321	1.410768
0.223200	15.501642
0.896683	107.001620
0.582311	82.589328
0.698920	92.470636
0.823848	59.342323
0.385021	24.816941
0.061219	6.695567
0.841547	115.669032
0.763328	115.199195
0.934853	115.753994
0.222271	-9.255852
0.217214	-3.958752
0.706961	106.180427
0.888426	94.896354
0.549814	137.267576
0.107960	-1.293195
0.085111	37.820659
0.388789	21.578007
0.467383	-9.712925
0.623909	87.181863
0.373501	-8.228297
0.513332	101.075609
0.350725	-40.086564
0.716211	103.345308
0.731636	73.912028
0.273863	-9.457556
0.211633	-8.332207
0.944221	100.120253
0.053764	-13.731698
0.126833	22.891675
0.952833	100.649591
0.391609	3.001104
0.560301	82.903945
0.124723	-1.402796
0.465680	-23.777531
0.699873	115.586605
0.164134	-27.405211
0.455761	9.841938
0.508542	96.403373
0.138619	-29.087463
0.335182	2.768225
0.908629	118.513475
0.546601	96.319043
0.378965	13.583555
0.968621	98.648346
0.637999	91.656617
0.350065	-1.319852
0.632691	93.645293
0.936524	65.548418
0.310956	-49.939516
0.437652	19.745224
0.166765	-14.740059
0.571214	114.872056
0.952377	73.520802
0.665329	121.980607
0.258070	-20.425137
0.912161	85.005351
0.777582	100.838446
0.642707	82.500766
0.885676	108.045948
0.080061	2.229873
0.039914	11.220099
0.958512	135.837013
0.377383	5.241196
0.661073	115.687524
0.454375	3.043912
0.412516	-26.419289
0.854970	89.209930
0.698472	120.521925
0.465561	30.051931
0.328890	39.783113
0.309133	8.814725
0.418943	44.161493
0.553797	120.857321
0.799873	91.368473
0.811363	112.981216
0.785574	107.024467
0.949198	105.752508
0.666452	120.014736
0.652462	112.715799
0.290749	-14.391613
0.508548	93.292829
0.680486	110.367074
0.356790	-19.526539
0.199903	-3.372472
0.264926	5.280579
0.166431	-6.512506
0.370042	-32.124495
0.628061	117.628346
0.228473	19.425158
0.044737	3.855393
0.193282	18.208423
0.519150	116.176162
0.351478	-0.461116
0.872199	111.552716
0.115150	13.795828
0.324274	-13.189243
0.446196	-5.108172
0.613004	168.180746
0.533511	129.766743
0.740859	93.773929
0.667851	92.449664
0.900699	109.188248
0.599142	130.378529
0.232802	1.222318
0.838587	134.089674
0.284794	35.623746
0.130626	-39.524461
0.642373	140.613941
0.786865	100.598825
0.403228	-1.729244
0.883615	95.348184
0.910975	106.814667
0.819722	70.054508
0.798198	76.853728
0.606417	93.521396
0.108801	-16.106164
0.318309	-27.605424
0.856421	107.166848
0.842940	95.893131
0.618868	76.917665
0.531944	124.795495
0.028546	-8.377094
0.915263	96.717610
0.925782	92.074619
0.624827	105.970743
0.331364	-1.290825
0.341700	-23.547711
0.342155	-16.930416
0.729397	110.902830
0.640515	82.713621
0.228751	-30.812912
0.948822	69.318649
0.706390	105.062147
0.079632	29.420068
0.451087	-28.724685
0.833026	76.723835
0.589806	98.674874
0.426711	-21.594268
0.872883	95.887712
0.866451	94.402102
0.960398	123.559747
0.483803	5.224234
0.811602	99.841379
0.757527	63.549854
0.569327	108.435392
0.841625	60.552308
0.264639	2.557923
0.202161	-1.983889
0.055862	-3.131497
0.543843	98.362010
0.689099	112.378209
0.956951	82.016541
0.382037	-29.007783
0.131833	22.478291
0.156273	0.225886
0.000256	9.668106
0.892999	82.436686
0.206207	-12.619036
0.487537	5.149336


================================================
FILE: classification_and_regression_trees/ex2test.txt
================================================
0.421862	10.830241
0.105349	-2.241611
0.155196	21.872976
0.161152	2.015418
0.382632	-38.778979
0.017710	20.109113
0.129656	15.266887
0.613926	111.900063
0.409277	1.874731
0.807556	111.223754
0.593722	133.835486
0.953239	110.465070
0.257402	15.332899
0.645385	93.983054
0.563460	93.645277
0.408338	-30.719878
0.874394	91.873505
0.263805	-0.192752
0.411198	10.751118
0.449884	9.211901
0.646315	113.533660
0.673718	125.135638
0.805148	113.300462
0.759327	72.668572
0.519172	82.131698
0.741031	106.777146
0.030937	9.859127
0.268848	-34.137955
0.474901	-11.201301
0.588266	120.501998
0.893936	142.826476
0.870990	105.751746
0.430763	39.146258
0.057665	15.371897
0.100076	9.131761
0.980716	116.145896
0.235289	-13.691224
0.228098	16.089151
0.622248	99.345551
0.401467	-1.694383
0.960334	110.795415
0.031214	-5.330042
0.504228	96.003525
0.779660	75.921582
0.504496	101.341462
0.850974	96.293064
0.701119	102.333839
0.191551	5.072326
0.667116	92.310019
0.555584	80.367129
0.680006	132.965442
0.393899	38.605283
0.048940	-9.861871
0.963282	115.407485
0.655496	104.269918
0.576463	141.127267
0.675708	96.227996
0.853457	114.252288
0.003933	-12.182861
0.549512	97.927224
0.218967	-4.712462
0.659972	120.950439
0.008256	8.026816
0.099500	-14.318434
0.352215	-3.747546
0.874926	89.247356
0.635084	99.496059
0.039641	14.147109
0.665111	103.298719
0.156583	-2.540703
0.648843	119.333019
0.893237	95.209585
0.128807	5.558479
0.137438	5.567685
0.630538	98.462792
0.296084	-41.799438
0.632099	84.895098
0.987681	106.726447
0.744909	111.279705
0.862030	104.581156
0.080649	-7.679985
0.831277	59.053356
0.198716	26.878801
0.860932	90.632930
0.883250	92.759595
0.818003	110.272219
0.949216	115.200237
0.460078	-35.957981
0.561077	93.545761
0.863767	114.125786
0.476891	-29.774060
0.537826	81.587922
0.686224	110.911198
0.982327	119.114523
0.944453	92.033481
0.078227	30.216873
0.782937	92.588646
0.465886	2.222139
0.885024	90.247890
0.186077	7.144415
0.915828	84.010074
0.796649	115.572156
0.127821	28.933688
0.433429	6.782575
0.946796	108.574116
0.386915	-17.404601
0.561192	92.142700
0.182490	10.764616
0.878792	95.289476
0.381342	-6.177464
0.358474	-11.731754
0.270647	13.793201
0.488904	-17.641832
0.106773	5.684757
0.270112	4.335675
0.754985	75.860433
0.585174	111.640154
0.458821	12.029692
0.218017	-26.234872
0.583887	99.413850
0.923626	107.802298
0.833620	104.179678
0.870691	93.132591
0.249896	-8.618404
0.748230	109.160652
0.019365	34.048884
0.837588	101.239275
0.529251	115.514729
0.742898	67.038771
0.522034	64.160799
0.498982	3.983061
0.479439	24.355908
0.314834	-14.256200
0.753251	85.017092
0.479362	-17.480446
0.950593	99.072784
0.718623	58.080256
0.218720	-19.605593
0.664113	94.437159
0.942900	131.725134
0.314226	18.904871
0.284509	11.779346
0.004962	-14.624176
0.224087	-50.547649
0.974331	112.822725
0.894610	112.863995
0.167350	0.073380
0.753644	105.024456
0.632241	108.625812
0.314189	-6.090797
0.965527	87.418343
0.820919	94.610538
0.144107	-4.748387
0.072556	-5.682008
0.002447	29.685714
0.851007	79.632376
0.458024	-12.326026
0.627503	139.458881
0.422259	-29.827405
0.714659	63.480271
0.672320	93.608554
0.498592	37.112975
0.698906	96.282845
0.861441	99.699230
0.112425	-12.419909
0.164784	5.244704
0.481531	-18.070497
0.375482	1.779411
0.089325	-14.216755
0.036609	-6.264372
0.945004	54.723563
0.136608	14.970936
0.292285	-41.723711
0.029195	-0.660279
0.998307	100.124230
0.303928	-5.492264
0.957863	117.824392
0.815089	113.377704
0.466399	-10.249874
0.876693	115.617275
0.536121	102.997087
0.373984	-37.359936
0.565162	74.967476
0.085412	-21.449563
0.686411	64.859620
0.908752	107.983366
0.982829	98.005424
0.052766	-42.139502
0.777552	91.899340
0.374316	-3.522501
0.060231	10.008227
0.526225	87.317722
0.583872	67.104433
0.238276	10.615159
0.678747	60.624273
0.067649	15.947398
0.530182	105.030933
0.869389	104.969996
0.698410	75.460417
0.549430	82.558068


================================================
FILE: classification_and_regression_trees/exp.txt
================================================
0.529582	100.737303
0.985730	103.106872
0.797869	99.666151
0.393473	-1.773056
0.272568	-1.170222
0.758825	96.752440
0.218359	2.337347
0.926357	98.343231
0.726881	99.633009
0.805311	102.253834
0.208632	0.493174
0.184921	-2.231071
0.660135	100.139355
0.871875	96.637420
0.657182	100.345442
0.942481	97.751546
0.427843	-1.380170
0.845958	98.195303
0.878696	99.380485
0.582034	100.971036
0.118114	2.397033
0.144718	1.304535
0.576046	101.624714
0.750305	97.601324
0.518281	100.093634
0.260793	-1.361888
0.390245	-2.973759
0.963020	98.877859
0.880661	97.631997
0.291780	-1.638124
0.192903	-2.221257
0.461442	-1.074725
0.821171	99.372052
0.144557	2.589464
0.379346	0.991090
0.383822	1.832389
0.055406	-1.870700
0.084308	-0.611701
0.719578	100.087948
0.417471	-0.510292
0.477894	-3.426525
0.871228	100.307522
0.113074	-1.011079
0.409434	-0.616173
0.967141	96.551856
0.938254	97.052196
0.079989	2.083496
0.150207	1.285491
0.417339	-0.462985
0.038787	-2.237234
0.954657	102.111432
0.844894	98.350138
0.106770	-0.998182
0.247831	2.483594
0.108687	-0.920229
0.758165	98.079399
0.199978	-3.490410
0.600602	99.850119
0.026466	1.342825
0.141239	-0.949858
0.181437	-2.223725
0.352656	2.251362
0.803371	99.647157
0.677303	100.414859
0.561674	99.133372
0.497533	-3.764935
0.523327	98.452850
0.507075	103.807755
0.791978	99.414598
0.956890	95.977239
0.487927	1.199149
0.788795	100.012047
0.554283	98.522458
0.814361	97.642150
0.788940	97.399942
0.515845	102.240479
0.758538	97.461917
0.041824	-3.294141
0.341352	1.246559
0.194801	-2.285278
0.805528	99.023113
0.435762	0.361749
0.941615	100.746547
0.478234	0.791146
0.057445	-4.266792
0.510079	98.845273
0.209900	-0.861890
0.902668	101.429190
0.456602	-2.856392
0.997595	99.828241
0.048240	-0.268920
0.319531	0.896696
0.264929	-1.000487
0.432727	-4.630489
0.419828	1.260534
0.667056	99.456518
0.488173	1.574322
0.746300	100.563503
0.528660	100.736739
0.624185	99.562872
0.169411	1.809929
0.011025	4.132846
0.974164	98.706049
0.267957	0.297803
0.726093	99.381040
0.465163	-2.344545
0.993698	101.507792
0.816513	99.903496
0.398756	0.378060
0.054974	-0.588770
0.857067	100.322945
0.362328	2.551786
0.316961	-0.528283
0.167881	-0.376517
0.393776	3.658204
0.739991	100.426554
0.457949	0.857428
0.060635	2.484776
0.942634	101.254420
0.553691	102.467820
0.394694	-0.248353
0.714625	99.650556
0.273503	1.111820
0.471886	-5.665559
0.746476	98.720163
0.140209	0.471820
0.024197	-2.854251
0.521287	99.703915
0.672280	100.463227
0.380342	-0.785713
0.956380	99.482209
0.455254	1.613841
0.647551	101.591193
0.682498	98.267734
0.054839	-2.286019
0.716849	100.614510
0.217732	-2.161633
0.918885	100.260067
0.576026	101.719788
0.868511	100.669152
0.661135	97.637969
0.166334	1.374014
0.106850	-3.658050
0.768242	104.193841
0.240916	-0.368100
0.124957	2.821672
0.984335	98.571444
0.908524	101.777344
0.861217	98.656403
0.944295	100.154508
0.527278	101.052710
0.717072	100.788373
0.130227	0.115694
0.494734	-1.220681
0.498733	0.961514
0.519411	101.331622
0.712409	104.891067
0.933858	98.180299
0.266051	0.398961
0.153690	-0.657128
0.209181	1.486816
0.942699	102.187578
0.766799	100.213348
0.862578	101.816969
0.223266	2.854445
0.611394	103.428497
0.996212	98.494158
0.724945	99.098450
0.399346	0.879259
0.750510	98.729864
0.446060	0.639843
0.999913	101.502887
0.111561	3.256383
0.094755	0.170475
0.366547	0.488994
0.179924	-0.871567
0.969023	99.982789
0.941420	100.416754
0.656851	98.520940
0.983166	99.546591
0.167843	0.033922
0.316245	2.171137
0.817118	102.849575
0.173642	1.209173
0.411030	2.022640
0.265041	2.216470
0.779660	98.475428
0.059354	-0.929568
0.722092	97.974003
0.511958	101.924447
0.371938	-0.640602
0.851009	97.873330
0.375918	-5.308115
0.797332	99.763778
0.107749	-3.770092
0.156937	-0.876724
0.960447	99.597097
0.413434	2.408090
0.644257	100.453125
0.119332	-0.495588


================================================
FILE: classification_and_regression_trees/exp2.dot
================================================
digraph decision_tree {
    "c830d5ff-5d25-4637-a268-2bb63f5d4351" [label="0: 0.304401"];
    "44889deb-3d44-405b-a7cf-d8dfa5604cb9" [label="[3.4687793552577886, 1.1852174309187824]"];
    "4a419f47-2097-4b6e-b01e-047203bf4370" [label="[0.0016985569361161585, 11.964773944276974]"];
    "c830d5ff-5d25-4637-a268-2bb63f5d4351" -> "44889deb-3d44-405b-a7cf-d8dfa5604cb9";
    "c830d5ff-5d25-4637-a268-2bb63f5d4351" -> "4a419f47-2097-4b6e-b01e-047203bf4370";
}

================================================
FILE: classification_and_regression_trees/exp2.txt
================================================
0.070670	3.470829
0.534076	6.377132
0.747221	8.949407
0.668970	8.034081
0.586082	6.997721
0.764962	9.318110
0.658125	7.880333
0.346734	4.213359
0.313967	3.762496
0.601418	7.188805
0.404396	4.893403
0.154345	3.683175
0.984061	11.712928
0.597514	7.146694
0.005144	3.333150
0.142295	3.743681
0.280007	3.737376
0.542008	6.494275
0.466781	5.532255
0.706970	8.476718
0.191038	3.673921
0.756591	9.176722
0.912879	10.850358
0.524701	6.067444
0.306090	3.681148
0.429009	5.032168
0.695091	8.209058
0.984495	11.909595
0.702748	8.298454
0.551771	6.715210
0.272894	3.983313
0.014611	3.559081
0.699852	8.417306
0.309710	3.739053
0.444877	5.219649
0.717509	8.483072
0.576550	6.894860
0.284200	3.792626
0.675922	8.067282
0.304401	3.671373
0.233675	3.795962
0.453779	5.477533
0.900938	10.701447
0.502418	6.046703
0.781843	9.254690
0.226271	3.546938
0.619535	7.703312
0.519998	6.202835
0.399447	4.934647
0.785298	9.497564
0.010767	3.565835
0.696399	8.307487
0.524366	6.266060
0.396583	4.611390
0.059988	3.484805
0.946702	11.263118
0.417559	4.895128
0.609194	7.239316
0.730687	8.858371
0.586694	7.061601
0.829567	9.937968
0.964229	11.521595
0.276813	3.756406
0.987041	11.947913
0.876107	10.440538
0.747582	8.942278
0.117348	3.567821
0.188617	3.976420
0.416655	4.928907
0.192995	3.978365
0.244888	3.777018
0.806349	9.685831
0.417555	4.990148
0.233805	3.740022
0.357325	4.325355
0.190201	3.638493
0.705127	8.432886
0.336599	3.868493
0.473786	5.871813
0.384794	4.830712
0.502217	6.117244
0.788220	9.454959
0.478773	5.681631
0.064296	3.642040
0.332143	3.886628
0.618869	7.312725
0.854981	10.306697
0.570000	6.764615
0.512739	6.166836
0.112285	3.545863
0.723700	8.526944
0.192256	3.661033
0.181268	3.678579
0.196731	3.916622
0.510342	6.026652
0.263713	3.723018
0.141105	3.529595
0.150262	3.552314
0.824724	9.973690
0.588088	6.893128
0.411291	4.856380
0.763717	9.199101
0.212118	3.740024
0.264587	3.742917
0.973524	11.683243
0.250670	3.679117
0.823460	9.743861
0.253752	3.781488
0.838332	10.172180
0.501156	6.113263
0.097275	3.472367
0.667199	7.948868
0.487320	6.022060
0.654640	7.809457
0.906907	10.775188
0.821941	9.936140
0.859396	10.428255
0.078696	3.490510
0.938092	11.252471
0.998868	11.863062
0.025501	3.515624
0.451806	5.441171
0.883872	10.498912
0.583567	6.912334
0.823688	10.003723
0.891032	10.818109
0.879259	10.639263
0.163007	3.662715
0.344263	4.169705
0.796083	9.422591
0.903683	10.978834
0.050129	3.575105
0.605553	7.306014
0.628951	7.556742
0.877052	10.444055
0.829402	9.856432
0.121422	3.638276
0.721517	8.663569
0.066532	3.673471
0.996587	11.782002
0.653384	7.804568
0.739494	8.817809
0.640341	7.636812
0.337828	3.971613
0.220512	3.713645
0.368815	4.381696
0.782509	9.349428
0.645825	7.790882
0.277391	3.834258
0.092569	3.643274
0.284320	3.609353
0.344465	4.023259
0.182523	3.749195
0.385001	4.426970
0.747609	8.966676
0.188907	3.711018
0.806244	9.610438
0.014211	3.517818
0.574813	7.040672
0.714500	8.525624
0.538982	6.393940
0.384638	4.649362
0.915586	10.936577
0.883513	10.441493
0.804148	9.742851
0.466011	5.833439
0.800574	9.638874
0.654980	8.028558
0.348564	4.064616
0.978595	11.720218
0.915906	10.833902
0.285477	3.818961
0.988631	11.684010
0.531069	6.305005
0.181658	3.806995
0.039657	3.356861
0.893344	10.776799
0.355214	4.263666
0.783508	9.475445
0.039768	3.429691
0.546308	6.472749
0.786882	9.398951
0.168282	3.564189
0.374900	4.399040
0.737767	8.888536
0.059849	3.431537
0.861891	10.246888
0.597578	7.112627
0.126050	3.611641
0.074795	3.609222
0.634401	7.627416
0.831633	9.926548
0.019095	3.470285
0.396533	4.773104
0.794973	9.492009
0.889088	10.420003
0.003174	3.587139
0.176767	3.554071
0.943730	11.227731
0.758564	8.885337


================================================
FILE: classification_and_regression_trees/model_tree.py
================================================
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import uuid
from collections import namedtuple

import numpy as np
import matplotlib.pyplot as plt

from regression_tree import *

def linear_regression(dataset):
    ''' Compute standard (ordinary least-squares) linear regression coefficients.
    '''
    dataset = np.matrix(dataset)
    # Split features/target and prepend a constant (bias) column
    X_ori, y = dataset[:, :-1], dataset[:, -1]
    X_ori, y = np.matrix(X_ori), np.matrix(y)
    m, n = X_ori.shape
    X = np.matrix(np.ones((m, n+1)))
    X[:, 1:] = X_ori

    # Normal-equation solution: w = (X^T X)^{-1} X^T y
    w = (X.T*X).I*X.T*y
    return w, X, y

def fleaf(dataset):
    ''' Leaf function: linear regression coefficients for the given dataset
    '''
    w, _, _ = linear_regression(dataset)
    return w

def ferr(dataset):
    ''' Error function: regress on the given dataset and return the residual variance
    '''
    w, X, y = linear_regression(dataset)
    y_prime = X*w
    return np.var(y_prime - y)

def get_nodes_edges(tree, root_node=None):
    ''' Return all nodes and edges in the tree
    '''
    Node = namedtuple('Node', ['id', 'label'])
    Edge = namedtuple('Edge', ['start', 'end'])

    nodes, edges = [], []

    if type(tree) is not dict:
        return nodes, edges

    if root_node is None:
        label = '{}: {}'.format(tree['feat_idx'], tree['feat_val'])
        root_node = Node._make([uuid.uuid4(), label])
        nodes.append(root_node)

    for sub_tree in (tree['left'], tree['right']):
        if type(sub_tree) is dict:
            node_label = '{}: {}'.format(sub_tree['feat_idx'], sub_tree['feat_val'])
        else:
            node_label = '{}'.format(np.array(sub_tree.T).tolist()[0])
        sub_node = Node._make([uuid.uuid4(), node_label])
        nodes.append(sub_node)

        edge = Edge._make([root_node, sub_node])
        edges.append(edge)

        sub_nodes, sub_edges = get_nodes_edges(sub_tree, root_node=sub_node)
        nodes.extend(sub_nodes)
        edges.extend(sub_edges)

    return nodes, edges

def dotify(tree):
    ''' Build the content of a Graphviz dot file for the tree
    '''
    content = 'digraph decision_tree {\n'
    nodes, edges = get_nodes_edges(tree)

    for node in nodes:
        content += '    "{}" [label="{}"];\n'.format(node.id, node.label)

    for edge in edges:
        start, end = edge.start, edge.end
        content += '    "{}" -> "{}";\n'.format(start.id, end.id)
    content += '}'

    return content

def tree_predict(data, tree):
    ''' Predict with a model tree: each leaf holds linear regression coefficients
    '''
    if type(tree) is not dict:
        w = tree
        y = np.matrix(data)*w
        return y[0, 0]

    feat_idx, feat_val = tree['feat_idx'], tree['feat_val']
    # +1 skips the constant (bias) term prepended to the data vector
    if data[feat_idx+1] < feat_val:
        return tree_predict(data, tree['left'])
    else:
        return tree_predict(data, tree['right'])

if '__main__' == __name__:
    dataset = load_data('exp2.txt')
    tree = create_tree(dataset, fleaf, ferr, opt={'err_tolerance': 0.1, 'n_tolerance': 4})

    # Write the model tree to a Graphviz dot file
    with open('exp2.dot', 'w') as f:
        f.write(dotify(tree))

    dataset = np.array(dataset)
    # Scatter plot of the raw data
    plt.scatter(dataset[:, 0], dataset[:, 1])

    # Plot the fitted regression curve
    x = np.sort(dataset[:, 0])
    y = [tree_predict([1.0] + [i], tree) for i in x]
    plt.plot(x, y, c='r')
    plt.show()


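The leaf values in `model_tree.py` are ordinary least-squares fits, which is why the `exp2.dot` leaves above show coefficient pairs like `[3.468..., 1.185...]` rather than scalar means. A minimal, self-contained sketch of that leaf-level step (independent of the repo's `np.matrix`-based implementation; `leaf_weights` is an illustrative name, not part of the codebase):

```python
import numpy as np

def leaf_weights(dataset):
    """Least-squares coefficients for one leaf, bias column prepended."""
    data = np.asarray(dataset, dtype=float)
    X_ori, y = data[:, :-1], data[:, -1:]
    # Prepend a column of ones so w[0] is the intercept
    X = np.hstack([np.ones((X_ori.shape[0], 1)), X_ori])
    # Solve the normal equations (X^T X) w = X^T y;
    # np.linalg.solve avoids forming an explicit inverse
    w = np.linalg.solve(X.T @ X, X.T @ y)
    return w, X, y

# Points on the exact line y = 3 + 2x should be recovered exactly.
data = [[0.0, 3.0], [1.0, 5.0], [2.0, 7.0]]
w, X, y = leaf_weights(data)
print(np.round(w.ravel(), 6))  # -> [3. 2.]
```

Note that the normal equations become singular when a leaf's feature column is constant; the repo's `(X.T*X).I` has the same limitation, which the `n_tolerance` split threshold helps avoid in practice.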

================================================
FILE: classification_and_regression_trees/notebook/分段函数回归树.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from regression_tree import *"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = load_data('ex0.txt')\n",
    "tree = create_tree(dataset, fleaf, ferr)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'feat_idx': 0,\n",
       " 'feat_val': 0.40015800000000001,\n",
       " 'left': {'feat_idx': 0,\n",
       "  'feat_val': 0.20819699999999999,\n",
       "  'left': -0.023838155555555553,\n",
       "  'right': 1.0289583666666666},\n",
       " 'right': {'feat_idx': 0,\n",
       "  'feat_val': 0.609483,\n",
       "  'left': 1.980035071428571,\n",
       "  'right': {'feat_idx': 0,\n",
       "   'feat_val': 0.81674199999999997,\n",
       "   'left': 2.9836209534883724,\n",
       "   'right': 3.9871631999999999}}}"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tree"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = np.array(dataset)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.collections.PathCollection at 0x109c20c50>"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW4AAAD8CAYAAABXe05zAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X+MHOd5H/Dvc8sluWfHXLq6AtJKNBXDESuaFc86WAwO\naEOltVTqhw9WElm1+gMwIiQtikgRrqBgAaRcFbqCSOQWNdAKiZGkUhVSknugLBV0W9IQypZqjr07\n07TJwrItSiujulRcJtGtyOXe0z925zg3O+/MO7MzszN73w9A4G5vf7zDI59993mf93lFVUFERMUx\nMugBEBFRNAzcREQFw8BNRFQwDNxERAXDwE1EVDAM3EREBcPATURUMAzcREQFw8BNRFQwG9J40uuu\nu063b9+exlMTEQ2l06dP/7mqjtncN5XAvX37dszNzaXx1EREQ0lE3ra9L1MlREQFw8BNRFQwDNxE\nRAXDwE1EVDAM3EREBcPATURUMKmUAxIRBZmdr+PQsfN4r9HEDdUKpu+6BVPjtUEPqzAYuIkoU7Pz\ndTzx7TNottoAgHqjiSe+fQYAGLwtMVVCRJk6dOz8atB2NFttHDp2fkAjKh4GbiLK1HuNZqTbqRcD\nNxFl6oZqJdLt1IuBm4gyNX3XLaiUS2tuq5RLmL7rlgGNqHi4OElEmXIWIG2rSliB0ouBm4gyNzVe\nswq+rEDxx1QJEeWWqQLl0cMLmJw5jtn5+oBGNliccRNRbgVVmvQ7+y5yCoYzbiLKrbBKk7j1304K\npt5oQnHtTaAoM3gGbiLKLb8KFK849d9F3wTEVAkR5Za7AqVuCNBx6r+LvgmIgZuIBios1+xUoHgr\nTID49d83VCu+bwRR3gQGmSNnqoSIBiZKrnlqvIZnvrQLtWoFAqBWreCZL+2KFSz73QQ06Bw5Z9xE\nNDBBuWa/gGxb/x3EmSk3W22URNBWRS3ijDnquJPGwE1EA5NVrtkJ1vVGEwJAu7e3VVEpl7B3xxgO\nHTuPxw4v9KQ9/FIig86RM3AT0cAkkWt2mHLO3ty4eh7XbLXxwqkLq7e768MB+O7crI6WcXG5lci4\n42DgJqKB2btjbE3QBOItOAZtjfdLa3j5BXOnNNAvJbJpwwgq5VIiC6VxMHAT0UDMztfxyul6T9D8\n3LYtPXnisAoOU8758SOLaKv3FewEpT0uNVt49sHdA6sqYeAmolSZgq5pJvw/3voAs/P1NTnmsEZT\nphrvtuqanLYf08+dtIffcys6bxaD2iZvHbhFpARgDkBdVe9Nb0hENCyCgq5pRusERffmG9Ns2hEU\nnP1ud+6/dbSMy602llsra37uTnt4a8cd9UYTjx1ewNzbH+DpqV2GV09HlBn37wD4EYBPpDQWIhoy\nQWVzpoVJYG1QNwX4tiqe+PYZbC6PBM6ovbaOlnHgvp0AnKC8NmhXK2UcvH9nTyrGNPN+/tQFAMg0\neFttwBGRGwHcA+AP0h0OEQ2ToLK56btugRge567OCKrUaLbavtUdQUY3bghM1Xxs04aenZsn999p\nHCsAvHDqQqYNqmx3Tn4DwD8HsBJ2RyIiR9D5klPjNXxlz7aegOitzrBpNBWF82YStRY76A3ESe9k\nJTRwi8i9AN5X1dMh93tEROZEZG5paSmxARJRcYVtLX96aheefXB34DZ2Z6t7SYLmvPacABz10OKg\nTwhAtg2qbHLckwDuF5F9ADYD+ISIPK+qD7vvpKrPAXgOACYmJuLV3xDRUJkar2Hu7Q/w4pvvoK2K\nkggeuL3m20Qq7HkA80KhiXfR0v2mMX3XLZGbVo1uLOHDK/6vn+Up9aEzblV9QlVvVNXtAL4M4Lg3\naBMR+XFqtZ1a6rYqXjldj5UPdmbetirlEr6yZ5txNh+laZVTHWMK2lmfUs86biJKTdLNmKbGa8ZN\nNSMCXL+lErghZna+jsmZ42vuc3L/nb
GuwxG1QVUSIgVuVf0egO+lMhIiGjppNGN66I6bVkvw3P7+\nHdsCS/L6OTE+7OzLx48sZlrPzX7cRJSaqAuANp6e2oWH92xbXawsieDhPcFBG+jvuLKw8bZV8fyp\nC3hy9kzg/ZLCwE1Eqen3wAKTp6d24a1n9uFnM/fgrWf2Wc10+5n925YkvvjmO6H3SQJz3ESUGve2\n9SSaMfVzXFjcFrJ+By+YxG1oFRUDNxGlKqjcL0og7idHDcQr//O+pnPwwkettn8PlGRKzUMxVUJE\nAxH13MZ+ctRAvDMrTa9pmldXNmQTUjnjJqJURWnrGlQqaJujDprFR539R61+8TasSgsDNxGlJk5b\n13qj2VNrPTVes8pRx02nmB5nOqLMlOvOavckUyVElJqwtq5+BPBNn9hUqMRNpxhTIgrf13zojptS\nqZaxxcBNRKkJa+vqDX5+ByK40ydhOeq4JX+mn19qtnxf8+mpXZHz5UliqoSIUhOU3vArFQw7WCGs\nIVXckr+wcfq9pk1zrLRwxk00AE7PjJv3v4bJmeOZNuHPUlh6wzmk4Kcz9+Dk/jtR63OnZdwNP2lt\nFEoLAzdRxqKWwRVZ1BK8fgNonJK/fh43KKIp7PSZmJjQubm5xJ+XaBhMzhz3/Vheq1asOtUNO3dZ\n3pZKGSJAY7nV967LvBOR06o6YXNfzriJMpZGx7xh4qRPnn1wNy5fXcHF5dbQfzKJioGbKGNpdMwb\nRv3ulBxmDNxEGSvaQtig8JOJGQM3UcaKthA2KPxkYsY6bqIBGGQNcFHE6ea3XjBwEw2JfnpV51HS\nvbyHCQM30RDot1d1XvGTiT8GbqIMpD0bjtIiddhm5usRAzdRyrKYDUfpVT2MM/P1hlUlRClLqh45\nqL+JbQUGa6OHAwM3UcqSqEcO629iWxvO2ujhwFQJUcrithp1M82UHz+yCMC+AqOfsXhz43t3jOHE\nuSXmygeAgZsoZUnUI5tmxG3VNTnqsMAZdyx+ufHnT11Y/Tlz5dli4CZKmXs2XG80URJZk1e2Obx2\nxHDGIbC2eiSsYiRubbTfjD9oHJQuBm6iDDjBzLaiwzvDNQVtx3uNpu+s+NHDC3j8yALa3YdvHS3j\nwH07I7ePtc2BM1eeDQZuooxEqbW2meG63VCtGB/TdsX8i8stTL+8Ni/uxztz31Ipo9HsPe3ca0ul\nbD1mio+BmygjUSo6os5c9+4YwwuunHOQVlsDUytA7yeDcklQHhG0VoJn/iKRhk0xhQZuEdkM4A0A\nm7r3f1lVD6Q9MKJhE6WiI+jgXD8nzi1FeowptfLY4YWeU9aBTrDfOlrG6MYNeK9bkuinsRw+K6f+\n2dRxXwZwp6reBmA3gLtFZE+6wyIaPlH6cPvdN8h7jWakKhVTaiVoPt1Ybq0e7Gs61JepkmyEBm7t\n+Kvut+Xun+QPqiRaBzaXr/2Xq1bKxj7c3p7dYW6oVjA1XsPkpz9pNY7pu26JNKN3XsP9+PJI78ga\nzRa2D/nJ9XlgtXNSREoisgDgfQD/RVXf9LnPIyIyJyJzS0tLSY+TqNCctMRFVyrh8tWVwMc4Zy8G\nzXCBtbP2F37zl/Hwnm0oBSSbt452ZsVR0tHeTwZT4zV8fLM508rzIdNlFbhVta2quwHcCODzIvJZ\nn/s8p6oTqjoxNjaW9DiJCq3fHiGm1MnW0d5Z+9NTu/DWM/vwjQd3+z5GFTh49Kz1x2bTCT1h+Wz2\nQElPpKoSVW2IyAkAdwP4QTpDIho+/fYICdo44zSf8t7uPOapV8+umenblPUBnVl20JFqNouhrOtO\nR+iMW0TGRKTa/boC4O8COJf2wIiGSXXUf9HOdLuXqWxv/OvfxaOHF4zNp6bGaxjdaD8/c1Is7t2d\npnSHzQIqz4dMh81v9HoAfywiJXQC/RFV/U66wyIaLqaNjyEbIgEAT86ewQunLqymNuqNJqZfWgSk\nU6
bn5d3UYzvrrZRLeOD2Gl45Xffd3Qmgp8nU5vKIcaMQz4dMT2jgVtXvAxjPYCxEQ+uSIT1hut0x\nO19fE7QdYRth3MHalNJw12U7s3hTLt5b3+1tMgUA5ZLgYxs34FKzxW6BKePOSaIMxG2neujY+Vi1\nt97SPb+OgAfu29kTWB87vOD7fDZjaLUVH9u0AQsHvhBjxBQFD1IgykCUzTducRb3yiPSU7rn1IQD\nwfnrfnPSXIzMBgM3UQa8G2pMJXZecQLpxzdv6HneqfHa6puH02nQr9Y66o7NJMZL0TFwE2XEvaHG\nySf7nR/pFieQmuqrbWrJvW8wQRt5TOOl9DHHTZSxKCete+u3gw5UcJhmvba15O4acO9Yg1QrZS5G\nZoQzbqKMRd1F6Z6pr4QE7aC8uSmgBzWG8kvxPLxnm2++/uD9OwPHRsnhjJsoY6bdhjZNn4J2K9ZC\nSvCm77oF0y8t9pQSfnjlKmbn68bH+Z1lOfGpT0Y+/oySw8BNlLGSId1hk082lfbZLHROjdd6tr8D\naw9WsGVzMDGlh4GbKGOmHHVY7hqIf9ivw7RwyTK+YmHgJspYzZDuCGrd6tbPbDfuRiDKFy5OEmUs\n7macor82JYczbqKM9ZvuKOprU3JEbdqTRTQxMaFzc3OJPy8R0bASkdOqOmFzX6ZKiIgKhoGbiKhg\nGLiJiAqGi5O0LvkdBcYFOioKBm4aSkGBOUqTJ6I8YqqEho4TmE0H6JqaPD316tkBjJYoOs64aegE\ndd+bGq8Zt3dfXG5h/OvfRWOZZyZSvnHGTUMnrO900Pbui8st31k6UZ4wcNPQMQVm53bb7d2mHtmz\n83VMzhwPPb2GKC0M3DR0TP049u4Yw+TMcTx2eAG2J3J5Z+9h+XOiLDBw09DxO7Xlc9u24IVTF1YD\nrm2nh+ro2tNhop5eQ5QGLk7SUPKem/jY4QX4xeqSCFZUsaVSxl9evoq253SYv/po7ekwtuc2EqWJ\nM24qLNtc86Fj532DNnDt8IKPbdqAjaXe/ElrRdfMpsPy50RZYOCmQvLLNU+/tIjxr3+3J5CHzYad\nxzdbK74/dz+e/awpD9jWlQppcua41eG6gPmMR1vOIbzOTszqaBmqwKXmtXpvgD2uqT9R2royx02F\nFCWn3E/QFgB7d4yt2SJ/cbmFSrmEZx/cjanxGmbn65h+eRGtdud16o0mpl9eBMAt9JSO0FSJiNwk\nIidE5IciclZEfieLgREFySqnrABOnFsKrCR56tWzq0Hb0Wort9BTamxy3FcBPK6qtwLYA+Cfisit\n6Q6LhlVSm1f8cs1pqFUroZUkFw0np5tuJ+pXaOBW1Z+r6v/ufv2XAH4EgJ//KLIkN694a7VtNtSM\nWG66cTiLjqwkobyJlOMWke0AxgG8mcZgaLiFNX8yMbVo9dZqu/PMXpVyqee1g4wI8MyXdq0+vzvH\n7TyfsyhZrZTRaPbOrquVcs9tREmwDtwi8nEArwB4VFX/wufnjwB4BAC2bduW2ABpeISlHGbn63jq\n1bOrKYZqpYx7b7ser5yuh/bO9p5evqVShgjWdPo7dOy8dSWKex9O2Mno9952PZ4/daHnOe697Xqr\n1yKKyqocUETKAL4D4Jiq/n7Y/VkOSH6ilPCFqVUrOLn/zkiPCdpB2c9rmK4rzhhp/Ur0lHcREQB/\nCOBHNkGbyCTJBcU4W8ynxmv4yp5tsE11274Gt8FT1myqSiYB/AMAd4rIQvfPvpTHRUPIWVAs2bbm\nCxB3YXDiU59c0ziqUjb/F7B9DS5eUtZsqkr+u6qKqv5NVd3d/fN6FoOj4eIsMvazIQaIv8XcqWpx\nl+l9ZNjmLrDv281t8JQ17pykTHgP6O2Hu9ojCr+qFtNbiMJ+12PY4iVR0hi4KRN+QTOOWrUSOyBG\nyTnXfNIcQSfHu0sTidLGwE2ZSGKhzkk/BAXQIDdUK77VH4K1M2+/
NIf3E4OpLJEoCwzclAnboOnl\nHHTg7sJnG0C9AX7vjrE1NeFAJ0g/cHsNJ84trXkjADplfs5tH16+GmvzEFEaGLgpE9N33dKT4w4L\n2pVyqSefPTlz3CqAPjl7Bi+curD6/PVGE6+crvsGab+A731zMEmqLp0oCgZuyoTfAl5Q0KsZgqop\n5VJvNDE5c3x1tuwO2o5mq40T55ZCN8VEyccnUdpIFBUDN2XGu4AXZ8dhUMB30iabyyPGmbxNrj2r\nXt9EcfHoMhqYOPXPYbsvm612YDtVm00xUTbO+FWfEKWNgZsGxtuatVathNZoux8Tle2mGr83h3JJ\nUPb0heUmGxoUnjlJhWVKtVQrZVy+utKzEPqVPdvw9NQuq+f2KzkEuMmG0hOlyRQDNxWW325MpxIF\nYJClYuFhwQUUd1PJeha21Zx/fzSsGLhzgLvy4uNWc1qPuDiZA0FHehEReXHGnQPD1IifKR+i9HHG\nnQPD0og/yVPciciMgTsHhqURP1M+RNlgqiQH+mnEn6fUxDClfIjyjIE7J+JUR+StGsXUR6RoKR+i\nvGOqpMDylpoYlpQPUd5xxp0TcVIeeUtN8OxFomwwcOdA3JRH1NSE+82hOlqGKnCp2Uo0wHJDDFH6\nmCrJgbgpjyipCW+p3sXlFhrNFsv2iAqIgTsHbFMes/N1TM4cx837X8PkzHEAsG6LGnaqC8v2iIqD\nqZIcMKU8RkQwO1/H1HgNs/N1TL+0iNZKp5tjvdHE9EuLOPTrt4UexQUkf/ILEQ0OZ9w5YDrVpa26\nmsI4ePTsatB2tFYUB4+etXoNm5K86mjZbsBENFCcceeAk9p4/MhizxmGTgqj0fQ/jst0u5ffKete\ncVuz21TE5GmjEFHRccadE1PjNawYImcSKQybI78uWb4JuNn0J2EPE6JkMXDnSFCzqdGy/6/KdLuf\nqfEaTu6/0xi84+xwtKmIOXj0bK42ChEVHQN3jgSV920ynGxuuj3q6wBAY/nKasWK7Ww4rCJmdr5u\nTOdwMZQontDALSLfEpH3ReQHWQxoPQs69byxbMhxG24Pe50Hbq9BPLd/eKUdOZVhmqUrOof5PvWq\nefGUPUyI4rFZnPwjAP8WwJ+kOxQCzDsPk27gdOLcEoLWIp1URtgCYtCip994vY8louhCZ9yq+gaA\nDzIYCwVIuoFTUnXdNouefraOlllVQhRTYjluEXlEROZEZG5paSmppyWXTRuu/bq2jpaNuyRt2MzU\nbWfzU+O1SG8glXIJB+7baX1/IlorscCtqs+p6oSqToyNjSX1tIRr5XTuRb6PWit9PefeHWM9OW6v\n5StXrfLczvhMqpWy1bZ8IrLDDTgFEFRyFycAzs7X8crpek+OuzwCuN8PLi63eroU+m2k8Sv3c1TK\nJRy8fycDNVGCGLgLIGrf7bBdiqaGUysqAPx3bjr9UrztZ939U/xwdk2UvNDALSIvAvgVANeJyLsA\nDqjqH6Y9sPXKL+hGqSix6e1tCvje7fYO5/5+AT8oaNeqFQZtohTYVJU8pKrXq2pZVW9k0E6PaWv4\n3h1j1hUlNjsZTYuOI4akt9N8KuqGGZb7EaWDOydzxBR0T5xbsu67bZNWMZUWuqtW3JyJeJSacZb7\nEaWHOe4cCQq6tkeC2aRVTGdDPnZ4wfc5neZTNh0GHSz3I0oPA3eOBB2ocPP+16zaofoFV7+0it8b\nwaFj5wODvjfgj4j45sU52yZKF1MlORJ0oIJtD5GgfidxXt8b9J0Ogz+duQe/9xu3+d6fs22idHHG\nnSM2M1qb+u24J62bUih+z+VUvzRbbZS646zxgASiTDBw54w76N68/zXf+6TZDtUm6HtLDp03l+Ur\nV1MbFxFdw1RJjgUdrDBIpg08zk5LnmxDlC4G7hxLuiNgUoJm/DzZhih9DNw51s9CY5rCZvw82YYo\nXcxx51zchcY0hdVzDzqVQzTs
OOOmyJxPAtVKuedneUjlEA07Bm6KZWq8hoUDX8A3Htydu1QO0bBj\nqoT6ksdUDtGw44ybiKhgGLiJiAqGgZuIqGAYuImICoaBm4ioYBi4iYgKhoGbiKhgGLiJiApm6Dbg\nPDl7Bi+++Q7aqiiJ4KE7bsLTU7sSfx3nIIH3Gk1sqZQhAjSWW1bHixER9SN3gdsdEKMGwSdnz+D5\nUxdWv2+rrn6fZPD2HiTQ6B6mC1w7XgwAgzcRpSJXqRInINYbTeszFt1efPOdSLfHZTpIwMGe1ESU\nplzNuP0Cos0Ziw6/E8eDbjcJm/Xb9JtmT2oiSkuuArcp2HlvdwfW6mgZqsAlV7rCqyRiPQZvGsQv\n9XFDtYJ6SGAeEVn9pBA39UNE5CdXqRKbMxa96ZSLyy00mi0EzakfuuMmq9efna/j8SOLxlm/w+9I\nMa+2KqZfXsT0S4uxUz9ERH5yNeP2O1nF25g/LL/s5q4qcWbp9UYTJRG0VVFzzYCdNwRTWsU963dm\nzO6qkr/4qIUVz0Nb7d7narbaePTwAr72n86gXBrBpSYrUYgomlwFbm9AjJtfFgA/nbln9Xtv+sMJ\nzu40SNgbgvfTgLsP9ex8HY8eXrC4wms+vNIGYE7HEBGZWAVuEbkbwL8GUALwB6o6k9aAwhrz2+SX\nvUH24NGzxqDcbLVx8OjZwBx50HFczptCv5xxMHATUZjQHLeIlAB8E8DfA3ArgIdE5Na0BzY7X8fk\nzHHcvP81TM4cX80Lh+WXvUF2dr6+ps7aT6PZQnW09/xEoJNueeD2Gg4dO98zFiB4pl4akUiLCI1m\ni/lvIgplE1c+D+DHqvoTVb0C4E8BfDHNQQXVczsH1TrnHG4dLaNaKRvPPLStp1aF7xuCs4nHPZbH\nDi/gydnOLDsodbOxJCiV7CtaooyXiNYvm1RJDYB7B8u7AO5IZzgdpnru3z3SySNHOefQtp76UrOF\nZx/cvbqAGUQBPH/qAr6z+PPAapZma8Xqtd1Y/01EYRIrBxSRR0RkTkTmlpaW+nouU/BaUWD65cVI\n6QRTiaHf/abGazi5/07ULB8TloKJw3a8RLR+2QTuOgB3IfSN3dvWUNXnVHVCVSfGxsb6GlRQ8Gq1\nFU+9etb6uWxqrp28+JOzZ/DpJ14PnXHbqJRL2GrImwNAuSQoj6xNowQtghIROWwC958B+IyI3Cwi\nGwF8GcDRNAcVFrwuLtsv4nlz4rVqBQ/v2bbm+2e+tAtzb3+A509diLw93sv9nAfu2+n7prF1tIxD\nv3YbDv36bT3jYFUJEYURtQhUIrIPwDfQKQf8lqr+y6D7T0xM6NzcXF8D2/3UdwNTEbVqBSf337nm\ntn46C376idf7DtpJj4mI1g8ROa2qEzb3tarjVtXXAbze16giOnj/Tky/tIiWdztil1//krAeI0H6\nDdqmNEeUhVQiIhu56lXiNjVew4OfN/cY8ebBgzoL2ojSiMpLXK/FOmwiSluutry7zc7X8cpp/yDo\nN7u17Szo9zqHjp3va8btPJJb14koC7mdcZt2JJZEfBfxbDoLerk3+vi9zuSnP7lm8dAGD1EgorTl\ndsZtmimbZsY2nQW9TG8OfouMADA5c9yqVJCbaIgoTbmdcQfNlP16WvuV/YWV10VNr9jUhIeNnYio\nX7mdcfvNoB2m48yiVnCYOg0GBd5NG0ZWx7RpwwguX+3d1r53R38bkIiIguR2xu3MoE2SSEf4zaBN\n6RUnH+6uLb/iE7QB4MS5/rb8ExEFyW3gBjrB27QomEQ6Ikp6xS8fbqpDYY6biNKU21SJI86iYxS2\n6ZUowZg5biJKU65n3EC8Rcc0mIKxd9sOG0URUdpyP+MG8rFt3DTzf+D2Gk6cW2IvEiLKTCECdx7Y\nHGRMRJSFwgfuLLvv5WHmT0RU6MDdb0dAIqIiyv3iZJB+OwISERVRoQN33I6ARERFVujAHacjIB
FR\n0RU6cEfZsk5ENCwKvTjJEj0iWo8KHbgBlugR0fpT6FQJEdF6xMBNRFQwDNxERAXDwE1EVDAM3ERE\nBcPATURUMKJqOoCrjycVWQLwdh9PcR2AP09oOEXA6x1+6+2aeb3RfUpVrU4aTyVw90tE5lR1YtDj\nyAqvd/itt2vm9aaLqRIiooJh4CYiKpi8Bu7nBj2AjPF6h996u2Zeb4pymeMmIiKzvM64iYjIYKCB\nW0TuFpHzIvJjEdnv8/NNInK4+/M3RWR79qNMjsX1/q6I/FBEvi8i/01EPjWIcSYl7Hpd93tARFRE\nCl2FYHO9IvIb3d/xWRH5j1mPMUkW/563icgJEZnv/pveN4hxJkVEviUi74vIDww/FxH5N92/j++L\nyOdSG4yqDuQPgBKAtwD8IoCNABYB3Oq5zz8B8O+6X38ZwOFBjTej690LYLT79W8P+/V27/cLAN4A\ncArAxKDHnfLv9zMA5gFs7X7/1wc97pSv9zkAv939+lYAPxv0uPu85r8F4HMAfmD4+T4A/xmAANgD\n4M20xjLIGffnAfxYVX+iqlcA/CmAL3ru80UAf9z9+mUAvyoikuEYkxR6vap6QlWXu9+eAnBjxmNM\nks3vFwD+BYB/BeCjLAeXApvr/U0A31TViwCgqu9nPMYk2VyvAvhE9+stAN7LcHyJU9U3AHwQcJcv\nAvgT7TgFoCoi16cxlkEG7hqAd1zfv9u9zfc+qnoVwCUAfy2T0SXP5nrdvorOu3dRhV5v96PkTar6\nWpYDS4nN7/eXAPySiJwUkVMicndmo0uezfUeBPCwiLwL4HUA/yyboQ1M1P/jsRX+BJxhJCIPA5gA\n8LcHPZa0iMgIgN8H8I8HPJQsbUAnXfIr6HyaekNEdqlqY6CjSs9DAP5IVX9PRH4ZwH8Qkc+q6sqg\nB1Z0g5xx1wHc5Pr+xu5tvvcRkQ3ofNz6f5mMLnk21wsR+TsAvgbgflW9nNHY0hB2vb8A4LMAvici\nP0MnJ3i0wAuUNr/fdwEcVdWWqv4UwP9BJ5AXkc31fhXAEQBQ1f8JYDM6PT2GldX/8SQMMnD/GYDP\niMjNIrIRncXHo577HAXwj7pf/xqA49pdBSig0OsVkXEA/x6doF3k/CcQcr2qeklVr1PV7aq6HZ2c\n/v2qOjeY4fbN5t/zLDqzbYjIdeikTn6S5SATZHO9FwD8KgCIyN9AJ3AvZTrKbB0F8A+71SV7AFxS\n1Z+n8koDXqXdh86s4y0AX+ve9nV0/gMDnV/0SwB+DOB/AfjFQY43g+v9rwD+L4CF7p+jgx5zmtfr\nue/3UOAjXupkAAAAeElEQVSqEsvfr6CTHvohgDMAvjzoMad8vbcCOIlOxckCgC8Mesx9Xu+LAH4O\noIXOp6evAvgtAL/l+v1+s/v3cSbNf8/cOUlEVDDcOUlEVDAM3EREBcPATURUMAzcREQFw8BNRFQw\nDNxERAXDwE1EVDAM3EREBfP/ATWTEFlkthKIAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x109b9d390>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.scatter(dataset[:, 0], dataset[:, 1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Plot the tree regression curve"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[<matplotlib.lines.Line2D at 0x109cca748>]"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAD8CAYAAACMwORRAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAFhVJREFUeJzt3X2MXXWdx/H3h1JKkVIeOpTSdjogNdnKuoAjAppdVlat\n1bQm1E3ZqGBwG1lZJZqsoknV7l8krm4IRLYRYnFdRfEhI1tC2AWCGqkUKOVpMcPMnT7YJ9rSWukD\nU777xz3Dzl7u9J6598ycc+79vJKbe+45v7n3ezrTz5w593vuTxGBmZm1lxPyLsDMzLLncDcza0MO\ndzOzNuRwNzNrQw53M7M25HA3M2tDDnczszbkcDcza0MOdzOzNnRiXi88a9as6OnpyevlzcxK6Ykn\nnng5Iroajcst3Ht6etiwYUNeL29mVkqShtKM82kZM7M25HA3M2tDDnczszbkcDcza0Opw13SFElP\nSbqvzrZpku6R1C9pvaSeLIs0M7PxGc+R++eBF8bYdj2wLyIuAL4N3NJqYWZm1rxU4S5pHvBh4Ltj\nDFkGrE2W7wWukqTWyzMzs2ak7XP/V+CfgBljbJ8LbAGIiGFJ+4GzgJdbrtDMDOCpp+DnP8+7imy8\n973wgQ9M6Es0DHdJHwF2RcQTkq5s5cUkrQRWAnR3d7fyVGbWaVatgvvug3Y4KfClL+Uf7sB7gKWS\nlgAnA6dJ+veI+PioMduA+cBWSScCM4E9tU8UEWuANQC9vb2emdvM0hschI9+tH2O3idYw3PuEXFz\nRMyLiB5gBfBQTbAD9AHXJsvLkzEObzPLRgRUKuDPo0qt6c+WkbQa2BARfcCdwPcl9QN7qf4SMDPL\nxp498Kc/OdzHYVzhHhGPAI8ky6tGrT8MfCzLwszM3lCpVO8d7qn5ClUzK77Bweq9wz01h7uZFZ+P\n3MfN4W5mxVepwBlnwMyZeVdSGg53Mys+d8qMm8PdzIrP4T5uDnczKzb3uDfF4W5mxbZ7N7z6qsN9\nnBzuZlZs7pRpisPdzIptJNzPOy/XMsrG4W5mxTYS7gsW5FpG2TjczazYKhU480w47bS8KykVh7uZ\nFZs7ZZricDezYnO4N8XhbmbF5R73pjnczay4du2CQ4cc7k1wuJtZcbkNsmkNw13SyZJ+J+lpSc9J\n+kadMddJ2i1pY3L79MSUa2YdxRcwNS3NTExHgPdFxEFJU4FfS7o/Ih6rGXdPRNyYfYlm1rHc4960\nhuGeTHR9MHk4Nbl58mszm3iVCpx1FsyYkXclpZPqnLukKZI2AruAByNifZ1hV0vaJOleSfPHeJ6V\nkjZI2rB79+4WyjazjjA46FMyTUoV7hFxLCIuAuYBl0q6sGbIL4GeiHgH8CCwdoznWRMRvRHR29XV\n1UrdZtYJ3AbZtHF1y0TEK8DDwOKa9Xsi4kjy8LvAO7Mpz8w6VgQMDblTpklpumW6JJ2eLE8H3g/8\nT82YOaMeLgVeyLJIM+tAO3fC4cM+cm9Smm6ZOcBaSVOo/jL4cUTcJ2k1sCEi+oDPSVoKDAN7gesm\nqmAz6xBug2xJmm6ZTcDFddavGrV8M3BztqWZWUdzuLfEV6iaWTG5x70lDnczK6bBQZg1C049Ne9K\nSsnhbmbF5DbIljjczayYKhW3QbbA4W5mxfP669Uedx+5N83hbmbFs3MnHDnicG+Bw93MisdtkC1z\nuJtZ8TjcW+ZwN7PiGRys3rvHvWkOdzMrnkoFurrgLW/Ju5LScribWfG4DbJlDnczKx5fwNQyh7uZ\nFYt73DPhcDezYtmxA44edbi3yOFuZsUy0injcG9JmpmYTpb0O0lPS3pO0jfqjJkm6R5J/ZLWS+qZ\niGLNrAO4xz0TaY7cjwDvi4i/AC4CFku6rGbM9cC+iLgA+DZwS7ZlmlnHcLhnomG4R9XB5OHU5BY1\nw5YBa5Ple4GrJCmzKs2sc1QqMHs2TJ
+edyWllmYOVZL5U58ALgBuj4j1NUPmAlsAImJY0n7gLODl\nDGs16yzDw3DoUN5VTL6XXvJRewZShXtEHAMuknQ68HNJF0bEs+N9MUkrgZUA3d3d4/1ys84RAQsX\n/t8pik6zYkXeFZReqnAfERGvSHoYWAyMDvdtwHxgq6QTgZnAnjpfvwZYA9Db21t7asfMRuzcWQ32\n5cvhstq3uDrA0qV5V1B6DcNdUhfwWhLs04H38+Y3TPuAa4HfAsuBhyLC4W3WrJEj9uuugw9/OM9K\nrKTSHLnPAdYm591PAH4cEfdJWg1siIg+4E7g+5L6gb2A/6Yya4V7va1FDcM9IjYBF9dZv2rU8mHg\nY9mWZtbB3A5oLfIVqmZF5I+8tRY53M2KyJ+KaC1yuJsVkcPdWuRwNysaf+StZcDhblY0O3bAkSMO\nd2uJw92saNwpYxlwuJsVzUi4ew5Ra4HD3axoRsJ9wYJcy7Byc7ibFU2lAmefDaecknclVmIOd7Oi\ncRukZcDhblY0DnfLgMPdrEjc424ZcbibFcn27XD0qMPdWuZwNysSt0FaRhzuZkXiC5gsIw53syJx\nj7tlpGG4S5ov6WFJz0t6TtLn64y5UtJ+SRuT26p6z2VmDVQqMHs2TJ+edyVWcmmm2RsGvhgRT0qa\nATwh6cGIeL5m3K8i4iPZl2jWQdwGaRlpeOQeEdsj4slk+Y/AC8DciS7MrCMNDjrcLRPjOucuqYfq\nfKrr62y+XNLTku6X9PYMajPrLMeOwebNDnfLRJrTMgBIOhX4KXBTRByo2fwksCAiDkpaAvwCWFjn\nOVYCKwG6u7ubLtqsLW3fDq+95jZIy0SqI3dJU6kG+w8i4me12yPiQEQcTJbXAVMlzaozbk1E9EZE\nb1dXV4ulm7UZt0FahtJ0ywi4E3ghIr41xphzknFIujR53j1ZFmrW9hzulqE0p2XeA3wCeEbSxmTd\nV4BugIi4A1gO3CBpGDgErIiImIB6zdrXSLj7lKVloGG4R8SvATUYcxtwW1ZFmXWkSgXOOcc97pYJ\nX6FqVhRug7QMOdzNiqJScaeMZcbhblYE7nG3jDnczYrgD3+A4WGHu2XG4W5WBG6DtIw53M2KwOFu\nGXO4mxXB4GD13j3ulhGHu1kRVCowZw6cfHLelVibcLibFYHbIC1jDnezIvAkHZYxh7tZ3oaHYcsW\nh7tlyuFuljf3uNsEcLib5c1tkDYBHO5meRtpg3S4W4Yc7mZ5q1RAco+7Zcrhbpa3SgXOPRemTcu7\nEmsjaabZmy/pYUnPS3pO0ufrjJGkWyX1S9ok6ZKJKdesDbkN0iZAmiP3YeCLEbEIuAz4rKRFNWM+\nBCxMbiuB72RapVk7c7jbBGgY7hGxPSKeTJb/CLwAzK0Ztgy4O6oeA06XNCfzas3ajXvcbYKkmSD7\nDZJ6gIuB9TWb5gJbRj3emqzbXvP1K6ke2dPtN48sjQhYvRp27Mi7kolx6FB1og6Hu2UsdbhLOhX4\nKXBTRBxo5sUiYg2wBqC3tzeaeQ7rML//PXz96zBzZvu+4bhgAVxxRd5VWJtJFe6SplIN9h9ExM/q\nDNkGzB/1eF6yzqw1AwPV+3XrHIBm45CmW0bAncALEfGtMYb1AZ9MumYuA/ZHxPYxxpqlNxLu55+f\nbx1mJZPmyP09wCeAZyRtTNZ9BegGiIg7gHXAEqAfeBX4VPalWkcaGIDp02H27LwrMSuVhuEeEb8G\n1GBMAJ/NqiizNwwMVI/addwfQTOr4StUrdhGwt3MxsXhbsUV4XA3a5LD3Yrr5Zfh4EGHu1kTHO5W\nXO6UMWuaw92Ky+Fu1jSHuxXXSLj70nyzcXO4W3ENDMCcOXDKKXlXYlY6DncrrsFBn5Ixa5LD3YrL\nbZBmTXO4WzEdPVr9nHOHu1lTHO5WTJs3w+uvw3nn5V2JWSk53K2Y3AZp1hKHuxWTw92sJQ53K6aB\nge
rMS3M8Fa9ZMxzuVkwDA9Xz7Sf4R9SsGWlmYrpL0i5Jz46x/UpJ+yVtTG6rsi/TOo7bIM1akuaw\n6HvA4gZjfhURFyW31a2XZR0tAl56yeFu1oKG4R4RjwJ7J6EWs6p9++DAAYe7WQuyOqF5uaSnJd0v\n6e0ZPad1KnfKmLUszQTZjTwJLIiIg5KWAL8AFtYbKGklsBKgu7s7g5e2tuRwN2tZy0fuEXEgIg4m\ny+uAqZJmjTF2TUT0RkRvV1dXqy9t7Wok3H11qlnTWg53SedI1anpJV2aPOeeVp/XOtjgIJx9Npx6\nat6VmJVWw9Mykn4IXAnMkrQV+BowFSAi7gCWAzdIGgYOASsiIiasYmt/boM0a1nDcI+Iaxpsvw24\nLbOKzAYG4N3vzrsKs1Lz5X9WLMPDMDTkI3ezFjncrVi2bIFjxxzuZi1yuFuxuA3SLBMOdysWh7tZ\nJhzuViwDAzB1Ksydm3clZqXmcLdiGRiAnh6YMiXvSsxKzeFuxeIed7NMONytWBzuZplwuFtxvPIK\n7N3rcDfLgMPdimNwsHrvcDdrmcPdisPhbpYZh7sVhz/q1ywzDncrjoEBOPNMmDkz70rMSs/hbsXh\nThmzzDjcrTgc7maZcbhbMRw7BpWKw90sIw3DXdJdknZJenaM7ZJ0q6R+SZskXZJ9mdb2tm2D115z\nuJtlpOFMTMD3qM60dPcY2z8ELExu7wa+k9zbZIiAm26CF1/Mu5LW7N9fvXe4m2UizTR7j0rqOc6Q\nZcDdybypj0k6XdKciNieUY12PPv2wa23Vj9sa/bsvKtpzQc/CO96V95VmLWFNEfujcwFtox6vDVZ\n96Zwl7QSWAnQ3d2dwUsbQ0PV+29+E66+Ot9azKwwJvUN1YhYExG9EdHb1dU1mS/dvjZvrt4vWJBv\nHWZWKFmE+zZg/qjH85J1NhlGjtwd7mY2Shbh3gd8MumauQzY7/Ptk2hoCKZPh1mz8q7EzAqk4Tl3\nST8ErgRmSdoKfA2YChARdwDrgCVAP/Aq8KmJKtbq2LwZurtByrsSMyuQNN0y1zTYHsBnM6vIxmdo\nqBruZmaj+ArVstu82efbzexNHO5ldvgw7NzpcDezN3G4l9lIG6RPy5hZDYd7mbnH3czG4HAvM/e4\nm9kYHO5lNjQEJ5wAc+fmXYmZFYzDvcw2b4Zzz4WpU/OuxMwKxuFeZkNDPiVjZnU53MvMFzCZ2Rgc\n7mV17Bhs3eojdzOry+FeVjt2VKelc7ibWR0O97IaaYP0aRkzq8PhXla+gMnMjsPhXlY+cjez43C4\nl9XQEJxxBsyYkXclZlZAqcJd0mJJL0rql/TlOtuvk7Rb0sbk9unsS7X/xx/1a2bHkWYmpinA7cD7\nga3A45L6IuL5mqH3RMSNE1Cj1TM0BG99a95VmFlBpTlyvxToj4iBiDgK/AhYNrFlWUMj0+uZmdWR\nJtznAltGPd6arKt1taRNku6VND+T6qy+V16BAwd8WsbMxpTVG6q/BHoi4h3Ag8DaeoMkrZS0QdKG\n3bt3Z/TSHcidMmbWQJpw3waMPhKfl6x7Q0TsiYgjycPvAu+s90QRsSYieiOit6urq5l6DdzjbmYN\npQn3x4GFks6TdBKwAugbPUDSnFEPlwIvZFeivYkn6TCzBhp2y0TEsKQbgQeAKcBdEfGcpNXAhojo\nAz4naSkwDOwFrpvAmm1oCKZNA//1Y2ZjaBjuABGxDlhXs27VqOWbgZuzLc3GNNIpc4KvQTOz+pwO\nZeRJOsysAYd7GXmSDjNrwOFeNkeOVD/L3UfuZnYcDvey2ZJcT+ZwN7PjcLiXjS9gMrMUHO5l4wuY\nzCwFh3vZDA2BBPPm5V2JmRWYw71shoZgzhw46aS8KzGzAnO4l40n6TCzFBzuZeMLmMwsBYd7mbz+\nerUV0p0yZtaAw71Mdu6Eo0d95G5mDTncy8Qf9WtmKTncy2Skx92n
ZcysAYd7mfjI3cxScriXydAQ\nzJwJp52WdyVmVnCpwl3SYkkvSuqX9OU626dJuifZvl5ST9aFGu5xN7PUGoa7pCnA7cCHgEXANZIW\n1Qy7HtgXERcA3wZuybpQwz3uZpZamiP3S4H+iBiIiKPAj4BlNWOWAWuT5XuBqyQpuzIN8CQdZpZa\nmnCfC2wZ9Xhrsq7umIgYBvYDZ2VRoCX276/efORuZimkmiA7K5JWAisBups9An3gAfjCFzKsqiSO\nHq3eO9zNLIU04b4NmD/q8bxkXb0xWyWdCMwE9tQ+UUSsAdYA9Pb2RjMFc9ppsKj2lH+HuOIKuOqq\nvKswsxJIE+6PAwslnUc1xFcAf1czpg+4FvgtsBx4KCKaC+9GLr8cfvKTCXlqM7N20TDcI2JY0o3A\nA8AU4K6IeE7SamBDRPQBdwLfl9QP7KX6C8DMzHKS6px7RKwD1tWsWzVq+TDwsWxLMzOzZvkKVTOz\nNuRwNzNrQw53M7M25HA3M2tDDnczszbkcDcza0OaqGuNGr6wtBsYavLLZwEvZ1hOGXifO4P3uTO0\nss8LIqKr0aDcwr0VkjZERG/edUwm73Nn8D53hsnYZ5+WMTNrQw53M7M2VNZwX5N3ATnwPncG73Nn\nmPB9LuU5dzMzO76yHrmbmdlxFDrcJS2W9KKkfklfrrN9mqR7ku3rJfVMfpXZSrHPX5D0vKRNkv5b\nUumnZmq0z6PGXS0pJJW+syLNPkv62+R7/Zyk/5jsGrOW4me7W9LDkp5Kfr6X5FFnViTdJWmXpGfH\n2C5Jtyb/HpskXZJpARFRyBvVz45/CTgfOAl4GlhUM+YfgDuS5RXAPXnXPQn7/NfAKcnyDZ2wz8m4\nGcCjwGNAb951T8L3eSHwFHBG8vjsvOuehH1eA9yQLC8CKnnX3eI+/yVwCfDsGNuXAPcDAi4D1mf5\n+kU+cr8U6I+IgYg4CvwIWFYzZhmwNlm+F7hKkiaxxqw13OeIeDgiXk0ePkZ12sMyS/N9Bvhn4Bbg\n8GQWN0HS7PPfA7dHxD6AiNg1yTVmLc0+B3BasjwT+MMk1pe5iHiU6uRFY1kG3B1VjwGnS5qT1esX\nOdznAltGPd6arKs7JiKGgf3AWZNS3cRIs8+jXU/1N3+ZNdzn5M/V+RHxn5NZ2ARK831+G/A2Sb+R\n9JikxZNW3cRIs89fBz4uaSvVyYH+cXJKy814/7+PS6qZmKx4JH0c6AX+Ku9aJpKkE4BvAdflXMpk\nO5HqqZkrqf519qikP4+IV3KtamJdA3wvIv5F0uVUp+68MCJez7uwMirykfs2YP6ox/OSdXXHSDqR\n6p9yeyaluomRZp+R9DfAV4GlEXFkkmqbKI32eQZwIfCIpArVc5N9JX9TNc33eSvQFxGvRcQg8Huq\nYV9Wafb5euDHABHxW+Bkqp/B0q5S/X9vVpHD/XFgoaTzJJ1E9Q3TvpoxfcC1yfJy4KFI3qkoqYb7\nLOli4N+oBnvZz8NCg32OiP0RMSsieiKih+r7DEsjYkM+5WYizc/2L6getSNpFtXTNAOTWWTG0uzz\nZuAqAEl/RjXcd09qlZOrD/hk0jVzGbA/IrZn9ux5v6Pc4N3mJVSPWF4CvpqsW031PzdUv/k/AfqB\n3wHn513zJOzzfwE7gY3JrS/vmid6n2vGPkLJu2VSfp9F9XTU88AzwIq8a56EfV4E/IZqJ81G4AN5\n19zi/v4Q2A68RvUvseuBzwCfGfU9vj3593gm659rX6FqZtaGinxaxszMmuRwNzNrQw53M7M25HA3\nM2tDDnczszbkcDcza0MOdzOzNuRwNzNrQ/8LfR/vUvqwLfQAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x109c6def0>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "x = np.linspace(0, 1, 50)\n",
    "y = [tree_predict([i], tree) for i in x]\n",
    "plt.plot(x, y, c='r')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
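The notebook above predicts with `tree_predict` on the dict-based tree built by `regression_tree.py`. A minimal, self-contained sketch of that prediction step is below; the node format (`feat_idx`, `feat_val`, `left`, `right`) matches the printed trees, but the split direction (left branch for values greater than `feat_val`) is an assumption about the real implementation.

```python
def tree_predict(x, tree):
    """Walk a dict-based regression tree down to a leaf value.

    x    -- feature vector (a list/tuple of numbers)
    tree -- nested dict with 'feat_idx', 'feat_val', 'left', 'right',
            where leaves are plain numbers.
    """
    while isinstance(tree, dict):
        # Assumed convention: values greater than the split value go left.
        if x[tree['feat_idx']] > tree['feat_val']:
            tree = tree['left']
        else:
            tree = tree['right']
    return tree
```

With a two-leaf toy tree, `tree_predict([0.7], tree)` falls into the left leaf and `tree_predict([0.3], tree)` into the right one.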


================================================
FILE: classification_and_regression_trees/notebook/后剪枝.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from prune import *"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "data = load_data('ex2.txt')\n",
    "tree = create_tree(data, fleaf, ferr)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Check whether it is a tree node"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "not_tree(tree)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Collapse the tree structure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "53.136107929136443"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "collapse(tree)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Print the tree structure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'feat_idx': 0,\n",
       " 'feat_val': 0.50854200000000005,\n",
       " 'left': {'feat_idx': 0,\n",
       "  'feat_val': 0.46324100000000001,\n",
       "  'left': {'feat_idx': 0,\n",
       "   'feat_val': 0.13062599999999999,\n",
       "   'left': {'feat_idx': 0,\n",
       "    'feat_val': 0.085111000000000006,\n",
       "    'left': {'feat_idx': 0,\n",
       "     'feat_val': 0.053763999999999999,\n",
       "     'left': 4.0916259999999998,\n",
       "     'right': -2.5443927142857148},\n",
       "    'right': 6.5098432857142843},\n",
       "   'right': {'feat_idx': 0,\n",
       "    'feat_val': 0.37738300000000002,\n",
       "    'left': {'feat_idx': 0,\n",
       "     'feat_val': 0.3417,\n",
       "     'left': {'feat_idx': 0,\n",
       "      'feat_val': 0.32889000000000002,\n",
       "      'left': {'feat_idx': 0,\n",
       "       'feat_val': 0.30031799999999997,\n",
       "       'left': {'feat_idx': 0,\n",
       "        'feat_val': 0.17652300000000001,\n",
       "        'left': {'feat_idx': 0,\n",
       "         'feat_val': 0.156273,\n",
       "         'left': -6.2479000000000013,\n",
       "         'right': -12.107972500000001},\n",
       "        'right': {'feat_idx': 0,\n",
       "         'feat_val': 0.20399300000000001,\n",
       "         'left': 3.4496025000000001,\n",
       "         'right': {'feat_idx': 0,\n",
       "          'feat_val': 0.21832099999999999,\n",
       "          'left': -11.822278500000001,\n",
       "          'right': {'feat_idx': 0,\n",
       "           'feat_val': 0.228628,\n",
       "           'left': 6.770429,\n",
       "           'right': {'feat_idx': 0,\n",
       "            'feat_val': 0.26463900000000001,\n",
       "            'left': -13.070501,\n",
       "            'right': 0.40377471428571476}}}}},\n",
       "       'right': -19.994155200000002},\n",
       "      'right': 15.059290750000001},\n",
       "     'right': {'feat_idx': 0,\n",
       "      'feat_val': 0.35147800000000001,\n",
       "      'left': -22.693879600000002,\n",
       "      'right': -15.085111749999999}},\n",
       "    'right': {'feat_idx': 0,\n",
       "     'feat_val': 0.44619599999999998,\n",
       "     'left': {'feat_idx': 0,\n",
       "      'feat_val': 0.41894300000000001,\n",
       "      'left': {'feat_idx': 0,\n",
       "       'feat_val': 0.388789,\n",
       "       'left': 3.6584772500000016,\n",
       "       'right': -0.89235549999999952},\n",
       "      'right': 14.38417875},\n",
       "     'right': -12.558604833333334}}},\n",
       "  'right': {'feat_idx': 0,\n",
       "   'feat_val': 0.48380299999999998,\n",
       "   'left': 3.4331330000000007,\n",
       "   'right': 12.50675925}},\n",
       " 'right': {'feat_idx': 0,\n",
       "  'feat_val': 0.73163599999999995,\n",
       "  'left': {'feat_idx': 0,\n",
       "   'feat_val': 0.64237299999999997,\n",
       "   'left': {'feat_idx': 0,\n",
       "    'feat_val': 0.61886799999999997,\n",
       "    'left': {'feat_idx': 0,\n",
       "     'feat_val': 0.58541299999999996,\n",
       "     'left': {'feat_idx': 0,\n",
       "      'feat_val': 0.56030100000000005,\n",
       "      'left': {'feat_idx': 0,\n",
       "       'feat_val': 0.53194399999999997,\n",
       "       'left': 101.73699325000001,\n",
       "       'right': {'feat_idx': 0,\n",
       "        'feat_val': 0.546601,\n",
       "        'left': 110.979946,\n",
       "        'right': 109.38961049999999}},\n",
       "      'right': 97.200180249999988},\n",
       "     'right': 123.2101316},\n",
       "    'right': 93.673449714285724},\n",
       "   'right': {'feat_idx': 0,\n",
       "    'feat_val': 0.66785099999999997,\n",
       "    'left': 114.15162428571431,\n",
       "    'right': {'feat_idx': 0,\n",
       "     'feat_val': 0.70889000000000002,\n",
       "     'left': {'feat_idx': 0,\n",
       "      'feat_val': 0.69891999999999999,\n",
       "      'left': 108.92921799999999,\n",
       "      'right': 104.82495374999999},\n",
       "     'right': 114.554706}}},\n",
       "  'right': {'feat_idx': 0,\n",
       "   'feat_val': 0.95390200000000003,\n",
       "   'left': {'feat_idx': 0,\n",
       "    'feat_val': 0.76332800000000001,\n",
       "    'left': 78.085643250000004,\n",
       "    'right': {'feat_idx': 0,\n",
       "     'feat_val': 0.79819799999999996,\n",
       "     'left': 102.35780185714285,\n",
       "     'right': {'feat_idx': 0,\n",
       "      'feat_val': 0.83858699999999997,\n",
       "      'left': {'feat_idx': 0,\n",
       "       'feat_val': 0.81521500000000002,\n",
       "       'left': 88.784498800000009,\n",
       "       'right': 81.110151999999999},\n",
       "      'right': {'feat_idx': 0,\n",
       "       'feat_val': 0.94882200000000005,\n",
       "       'left': {'feat_idx': 0,\n",
       "        'feat_val': 0.85642099999999999,\n",
       "        'left': 95.275843166666661,\n",
       "        'right': {'feat_idx': 0,\n",
       "         'feat_val': 0.912161,\n",
       "         'left': {'feat_idx': 0,\n",
       "          'feat_val': 0.89668300000000001,\n",
       "          'left': {'feat_idx': 0,\n",
       "           'feat_val': 0.88361500000000004,\n",
       "           'left': 102.25234449999999,\n",
       "           'right': 95.181792999999999},\n",
       "          'right': 104.82540899999999},\n",
       "         'right': 96.452866999999998}},\n",
       "       'right': 87.310387500000004}}}},\n",
       "   'right': {'feat_idx': 0,\n",
       "    'feat_val': 0.96039799999999997,\n",
       "    'left': 112.42895575000001,\n",
       "    'right': 105.24862350000001}}}}"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tree"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Post-prune using the test data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "data_test = load_data('ex2test.txt')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "merged\n",
      "merged\n",
      "merged\n",
      "merged\n",
      "merged\n",
      "merged\n",
      "merged\n",
      "merged\n"
     ]
    }
   ],
   "source": [
    "pruned_tree = postprune(tree, data_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'feat_idx': 0,\n",
       " 'feat_val': 0.50854200000000005,\n",
       " 'left': {'feat_idx': 0,\n",
       "  'feat_val': 0.46324100000000001,\n",
       "  'left': {'feat_idx': 0,\n",
       "   'feat_val': 0.13062599999999999,\n",
       "   'left': {'feat_idx': 0,\n",
       "    'feat_val': 0.085111000000000006,\n",
       "    'left': 0.77361664285714249,\n",
       "    'right': 6.5098432857142843},\n",
       "   'right': {'feat_idx': 0,\n",
       "    'feat_val': 0.37738300000000002,\n",
       "    'left': {'feat_idx': 0,\n",
       "     'feat_val': 0.3417,\n",
       "     'left': {'feat_idx': 0,\n",
       "      'feat_val': 0.32889000000000002,\n",
       "      'left': {'feat_idx': 0,\n",
       "       'feat_val': 0.30031799999999997,\n",
       "       'left': {'feat_idx': 0,\n",
       "        'feat_val': 0.17652300000000001,\n",
       "        'left': -9.1779362500000019,\n",
       "        'right': {'feat_idx': 0,\n",
       "         'feat_val': 0.20399300000000001,\n",
       "         'left': 3.4496025000000001,\n",
       "         'right': {'feat_idx': 0,\n",
       "          'feat_val': 0.21832099999999999,\n",
       "          'left': -11.822278500000001,\n",
       "          'right': {'feat_idx': 0,\n",
       "           'feat_val': 0.228628,\n",
       "           'left': 6.770429,\n",
       "           'right': {'feat_idx': 0,\n",
       "            'feat_val': 0.26463900000000001,\n",
       "            'left': -13.070501,\n",
       "            'right': 0.40377471428571476}}}}},\n",
       "       'right': -19.994155200000002},\n",
       "      'right': 15.059290750000001},\n",
       "     'right': {'feat_idx': 0,\n",
       "      'feat_val': 0.35147800000000001,\n",
       "      'left': -22.693879600000002,\n",
       "      'right': -15.085111749999999}},\n",
       "    'right': {'feat_idx': 0,\n",
       "     'feat_val': 0.44619599999999998,\n",
       "     'left': {'feat_idx': 0,\n",
       "      'feat_val': 0.41894300000000001,\n",
       "      'left': 1.3830608750000011,\n",
       "      'right': 14.38417875},\n",
       "     'right': -12.558604833333334}}},\n",
       "  'right': {'feat_idx': 0,\n",
       "   'feat_val': 0.48380299999999998,\n",
       "   'left': 3.4331330000000007,\n",
       "   'right': 12.50675925}},\n",
       " 'right': {'feat_idx': 0,\n",
       "  'feat_val': 0.73163599999999995,\n",
       "  'left': {'feat_idx': 0,\n",
       "   'feat_val': 0.64237299999999997,\n",
       "   'left': {'feat_idx': 0,\n",
       "    'feat_val': 0.61886799999999997,\n",
       "    'left': {'feat_idx': 0,\n",
       "     'feat_val': 0.58541299999999996,\n",
       "     'left': {'feat_idx': 0,\n",
       "      'feat_val': 0.56030100000000005,\n",
       "      'left': {'feat_idx': 0,\n",
       "       'feat_val': 0.53194399999999997,\n",
       "       'left': 101.73699325000001,\n",
       "       'right': 110.18477824999999},\n",
       "      'right': 97.200180249999988},\n",
       "     'right': 123.2101316},\n",
       "    'right': 93.673449714285724},\n",
       "   'right': {'feat_idx': 0,\n",
       "    'feat_val': 0.66785099999999997,\n",
       "    'left': 114.15162428571431,\n",
       "    'right': {'feat_idx': 0,\n",
       "     'feat_val': 0.70889000000000002,\n",
       "     'left': 106.87708587499999,\n",
       "     'right': 114.554706}}},\n",
       "  'right': {'feat_idx': 0,\n",
       "   'feat_val': 0.95390200000000003,\n",
       "   'left': {'feat_idx': 0,\n",
       "    'feat_val': 0.76332800000000001,\n",
       "    'left': 78.085643250000004,\n",
       "    'right': {'feat_idx': 0,\n",
       "     'feat_val': 0.79819799999999996,\n",
       "     'left': 102.35780185714285,\n",
       "     'right': {'feat_idx': 0,\n",
       "      'feat_val': 0.83858699999999997,\n",
       "      'left': 84.947325400000011,\n",
       "      'right': {'feat_idx': 0,\n",
       "       'feat_val': 0.94882200000000005,\n",
       "       'left': {'feat_idx': 0,\n",
       "        'feat_val': 0.85642099999999999,\n",
       "        'left': 95.275843166666661,\n",
       "        'right': {'feat_idx': 0,\n",
       "         'feat_val': 0.912161,\n",
       "         'left': {'feat_idx': 0,\n",
       "          'feat_val': 0.89668300000000001,\n",
       "          'left': 98.717068749999996,\n",
       "          'right': 104.82540899999999},\n",
       "         'right': 96.452866999999998}},\n",
       "       'right': 87.310387500000004}}}},\n",
       "   'right': 108.838789625}}}"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pruned_tree"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Generate a dot file of the tree for visualization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "with open('ex2_prune.dot', 'w') as f:\n",
    "    content = dotify(pruned_tree)\n",
    "    f.write(content)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[1m\u001b[36m__pycache__\u001b[m\u001b[m/                 ex2test.txt\r\n",
      "\u001b[1m\u001b[36mdot\u001b[m\u001b[m/                         \u001b[1m\u001b[36mpic\u001b[m\u001b[m/\r\n",
      "ex0.txt                      prune.py\r\n",
      "ex00.txt                     regression_tree.py\r\n",
      "ex2.txt                      后剪枝.ipynb\r\n",
      "ex2_prune.dot                分段函数回归树.ipynb\r\n"
     ]
    }
   ],
   "source": [
    "ls"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: classification_and_regression_trees/notebook/模型树对分段线性函数进行回归.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from model_tree import *"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 加载数据"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = load_data('exp2.txt')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 创建模型树"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'feat_idx': 0, 'feat_val': 0.30440099999999998, 'left': matrix([[ 3.46877936],\n",
       "         [ 1.18521743]]), 'right': matrix([[  1.69855694e-03],\n",
       "         [  1.19647739e+01]])}"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tree = create_tree(dataset, fleaf, ferr, opt={'err_tolerance': 0.1, 'n_tolerance': 4})\n",
    "tree"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 绘制回归曲线"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXwAAAD8CAYAAAB0IB+mAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAADU9JREFUeJzt3GGI5Hd9x/H3xztTaYym9FaQu9Ok9NJ42ELSJU0Raoq2\nXPLg7oFF7iBYJXhgGylVhBRLlPjIhloQrtWTilXQGH0gC57cA40ExAu3ITV4FyLb03oXhawxzZOg\nMe23D2bSna53mX92Z3cv+32/4GD+//ntzJcfe++dndmZVBWSpO3vFVs9gCRpcxh8SWrC4EtSEwZf\nkpow+JLUhMGXpCamBj/JZ5M8meT7l7g+ST6ZZCnJo0lunP2YkqT1GvII/3PAgRe5/lZg3/jfUeBf\n1j+WJGnWpga/qh4Efv4iSw4Bn6+RU8DVSV4/qwElSbOxcwa3sRs4P3F8YXzup6sXJjnK6LcArrzy\nyj+8/vrrZ3D3ktTHww8//LOqmlvL184i+INV1XHgOMD8/HwtLi5u5t1L0stekv9c69fO4q90ngD2\nThzvGZ+TJF1GZhH8BeBd47/WuRl4pqp+7ekcSdLWmvqUTpIvAbcAu5JcAD4CvBKgqj4FnABuA5aA\nZ4H3bNSwkqS1mxr8qjoy5foC/npmE0mSNoTvtJWkJgy+JDVh8CWpCYMvSU0YfElqwuBLUhMGX5Ka\nMPiS1ITBl6QmDL4kNWHwJakJgy9JTRh8SWrC4EtSEwZfkpow+JLUhMGXpCYMviQ1YfAlqQmDL0lN\nGHxJasLgS1ITBl+SmjD4ktSEwZekJgy+JDVh8CWpCYMvSU0YfElqwuBLUhMGX5KaMPiS1ITBl6Qm\nDL4kNWHwJamJQcFPciDJ40mWktx1kevfkOSBJI8keTTJbbMfVZK0HlODn2QHcAy4FdgPHEmyf9Wy\nvwfur6obgMPAP896UEnS+gx5hH8TsFRV56rqOeA+4NCqNQW8Znz5tcBPZjeiJGkWhgR/N3B+4vjC\n+NykjwK3J7kAnADef7EbSnI0yWKSxeXl5TWMK0laq1m9aHsE+FxV7QFuA76Q5Nduu6qOV9V8Vc3P\nzc3N6K4lSUMMCf4TwN6J4z3jc5PuAO4HqKrvAq8Cds1iQEnSbAwJ/mlgX5Jrk1zB6EXZhVVrfgy8\nDSDJmxgF3+dsJOkyMjX4VfU8cCdwEniM0V/jnElyT5KD42UfBN6b5HvAl4B3V1Vt1NCSpJdu55BF\nVXWC0Yuxk+funrh8FnjLbEeTJM2S77SVpCYMviQ1YfAlqQmDL0lNGHxJasLgS1ITBl+SmjD4ktSE\nwZekJgy+JDVh8CWpCYMvSU0YfElqwuBLUhMGX5KaMPiS1ITBl6QmDL4kNWHwJakJgy9JTRh8SWrC\n4EtSEwZfkpow+JLUhMGXpCYMviQ1YfAlqQmDL0lNGHxJasLgS1ITBl+SmjD4ktSEwZekJgy+JDUx\nKPhJDiR5PMlSkrsuseadSc4mOZPki7MdU5K0XjunLUiyAzgG/BlwATidZKGqzk6s2Qf8HfCWqno6\nyes2amBJ0toMeYR/E7BUVeeq6jngPuDQqjXvBY5V1dMAVfXkbMeUJK3XkODvBs5PHF8Yn5t0HXBd\nku8kOZXkwMVuKMnRJItJFpeXl9c2sSRpTWb1ou1OYB9wC3AE+EySq1cvqqrjVTVfVfNzc3MzumtJ\n0hBDgv8EsHfieM/43KQLwEJV/aqqfgj8gNEPAEnSZWJI8E8D+5Jcm+QK4DCwsGrN1xg9uifJLkZP\n8Zyb4ZySpHWaGvyqeh64EzgJPAbcX1VnktyT5OB42UngqSRngQeAD1XVUxs1tCTppUtVbckdz8/P\n1+Li4pbctyS9XCV5uKrm1/K1vtNWkpow+JLUhMGXpCYMviQ1YfAlqQmDL0lNGHxJasLgS1ITBl+S\nmjD4ktSEwZekJgy+JDVh8CWpCYMvSU
0YfElqwuBLUhMGX5KaMPiS1ITBl6QmDL4kNWHwJakJgy9J\nTRh8SWrC4EtSEwZfkpow+JLUhMGXpCYMviQ1YfAlqQmDL0lNGHxJasLgS1ITBl+SmjD4ktSEwZek\nJgYFP8mBJI8nWUpy14use0eSSjI/uxElSbMwNfhJdgDHgFuB/cCRJPsvsu4q4G+Ah2Y9pCRp/YY8\nwr8JWKqqc1X1HHAfcOgi6z4GfBz4xQznkyTNyJDg7wbOTxxfGJ/7P0luBPZW1ddf7IaSHE2ymGRx\neXn5JQ8rSVq7db9om+QVwCeAD05bW1XHq2q+qubn5ubWe9eSpJdgSPCfAPZOHO8Zn3vBVcCbgW8n\n+RFwM7DgC7eSdHkZEvzTwL4k1ya5AjgMLLxwZVU9U1W7quqaqroGOAUcrKrFDZlYkrQmU4NfVc8D\ndwIngceA+6vqTJJ7khzc6AElSbOxc8iiqjoBnFh17u5LrL1l/WNJkmbNd9pKUhMGX5KaMPiS1ITB\nl6QmDL4kNWHwJakJgy9JTRh8SWrC4EtSEwZfkpow+JLUhMGXpCYMviQ1YfAlqQmDL0lNGHxJasLg\nS1ITBl+SmjD4ktSEwZekJgy+JDVh8CWpCYMvSU0YfElqwuBLUhMGX5KaMPiS1ITBl6QmDL4kNWHw\nJakJgy9JTRh8SWrC4EtSEwZfkpoYFPwkB5I8nmQpyV0Xuf4DSc4meTTJN5O8cfajSpLWY2rwk+wA\njgG3AvuBI0n2r1r2CDBfVX8AfBX4h1kPKklanyGP8G8ClqrqXFU9B9wHHJpcUFUPVNWz48NTwJ7Z\njilJWq8hwd8NnJ84vjA+dyl3AN+42BVJjiZZTLK4vLw8fEpJ0rrN9EXbJLcD88C9F7u+qo5X1XxV\nzc/Nzc3yriVJU+wcsOYJYO/E8Z7xuf8nyduBDwNvrapfzmY8SdKsDHmEfxrYl+TaJFcAh4GFyQVJ\nbgA+DRysqidnP6Ykab2mBr+qngfuBE4CjwH3V9WZJPckOThedi/wauArSf49ycIlbk6StEWGPKVD\nVZ0ATqw6d/fE5bfPeC5J0oz5TltJasLgS1ITBl+SmjD4ktSEwZekJgy+JDVh8CWpCYMvSU0YfElq\nwuBLUhMGX5KaMPiS1ITBl6QmDL4kNWHwJakJgy9JTRh8SWrC4EtSEwZfkpow+JLUhMGXpCYMviQ1\nYfAlqQmDL0lNGHxJasLgS1ITBl+SmjD4ktSEwZekJgy+JDVh8CWpCYMvSU0YfElqwuBLUhMGX5Ka\nGBT8JAeSPJ5kKcldF7n+N5J8eXz9Q0mumfWgkqT1mRr8JDuAY8CtwH7gSJL9q5bdATxdVb8L/BPw\n8VkPKklanyGP8G8ClqrqXFU9B9wHHFq15hDwb+PLXwXeliSzG1OStF47B6zZDZyfOL4A/NGl1lTV\n80meAX4b+NnkoiRHgaPjw18m+f5aht6GdrFqrxpzL1a4FyvcixW/t9YvHBL8mamq48BxgCSLVTW/\nmfd/uXIvVrgXK9yLFe7FiiSLa/3aIU/pPAHsnTjeMz530TVJdgKvBZ5a61CSpNkbEvzTwL4k1ya5\nAjgMLKxaswD85fjyXwDfqqqa3ZiSpPWa+pTO+Dn5O4GTwA7gs1V1Jsk9wGJVLQD/CnwhyRLwc0Y/\nFKY5vo65txv3YoV7scK9WOFerFjzXsQH4pLUg++0laQmDL4kNbHhwfdjGVYM2IsPJDmb5NEk30zy\nxq2YczNM24uJde9IUkm27Z/kDdmLJO8cf2+cSfLFzZ5xswz4P/KGJA8keWT8/+S2rZhzoyX5bJIn\nL/VepYx8crxPjya5cdANV9WG/WP0Iu9/AL8DXAF8D9i/as1fAZ8aXz4MfHkjZ9qqfwP34k+B3xxf\nfl/nvRivuwp4EDgFzG/13Fv4fbEPeAT4rfHx67Z67i3ci+PA+8aX9wM/2uq5N2gv/gS4Efj+Ja6/\nDf
gGEOBm4KEht7vRj/D9WIYVU/eiqh6oqmfHh6cYvedhOxryfQHwMUafy/SLzRxukw3Zi/cCx6rq\naYCqenKTZ9wsQ/aigNeML78W+MkmzrdpqupBRn/xeCmHgM/XyCng6iSvn3a7Gx38i30sw+5Lramq\n54EXPpZhuxmyF5PuYPQTfDuauhfjX1H3VtXXN3OwLTDk++I64Lok30lyKsmBTZtucw3Zi48Ctye5\nAJwA3r85o112XmpPgE3+aAUNk+R2YB5461bPshWSvAL4BPDuLR7lcrGT0dM6tzD6re/BJL9fVf+1\npVNtjSPA56rqH5P8MaP3/7y5qv5nqwd7OdjoR/h+LMOKIXtBkrcDHwYOVtUvN2m2zTZtL64C3gx8\nO8mPGD1HubBNX7gd8n1xAVioql9V1Q+BHzD6AbDdDNmLO4D7Aarqu8CrGH2wWjeDerLaRgffj2VY\nMXUvktwAfJpR7Lfr87QwZS+q6pmq2lVV11TVNYxezzhYVWv+0KjL2JD/I19j9OieJLsYPcVzbjOH\n3CRD9uLHwNsAkryJUfCXN3XKy8MC8K7xX+vcDDxTVT+d9kUb+pRObdzHMrzsDNyLe4FXA18Zv279\n46o6uGVDb5CBe9HCwL04Cfx5krPAfwMfqqpt91vwwL34IPCZJH/L6AXcd2/HB4hJvsToh/yu8esV\nHwFeCVBVn2L0+sVtwBLwLPCeQbe7DfdKknQRvtNWkpow+JLUhMGXpCYMviQ1YfAlqQmDL0lNGHxJ\nauJ/Acz2XLpusNoKAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x10a4d8668>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "fig = plt.figure()\n",
    "ax = fig.add_subplot(111)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 绘制散点图"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.collections.PathCollection at 0x10a5270b8>"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dataset = np.array(dataset)\n",
    "ax.scatter(dataset[:, 0], dataset[:, 1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 绘制回归曲线"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[<matplotlib.lines.Line2D at 0x10a518710>]"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = np.sort(np.array(dataset[:, 0]))\n",
    "y = [tree_predict([1.0] + [i], tree) for i in x]\n",
    "ax.plot(x, y, c='r')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD8CAYAAABn919SAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xd809XixvHPaZtCy2qFOiggyg8XoIJVUa4DFyoyZC/F\nieO6FcQrlCUyKlec140gXKQMK7jQizguCooWRESuojKCSEEKQkPpOL8/0oSkTWmhadqkz/v1QtLk\nfJPztfBweqax1iIiIuEvqqorICIiwaFAFxGJEAp0EZEIoUAXEYkQCnQRkQihQBcRiRAKdBGRCKFA\nFxGJEAp0EZEIERPKD2vUqJFt3rx5KD9SRCTsffPNNzustUlllQtpoDdv3pyVK1eG8iNFRMKeMWZj\necqpy0VEJEIo0EVEIoQCXUQkQijQRUQiRJmBbox5zRiz3Rjzvc9zacaYH40x3xlj3jLGJFRuNUVE\npCzlaaG/DlxZ7LmPgNbW2tOB/wGPBLleIiJymMqctmit/cwY07zYcx/6fLkc6BXcaomIhI8RGWuY\nvWIzBdYSbQz9z23KY93bhLwewZiHfhMwp7QXjTFDgCEAzZo1C8LHiYhUHyMy1jBz+Sbv1wXWMnP5\nJrbNW8QP9RtjmjVjaKeT6d42udLrUqFBUWPMo0A+MKu0Mtbal6y1KdbalKSkMhc6iYiEldkrNpd4\nrtvapfxrdirDP5mGM9vFIwvWkJHprPS6HHGgG2NuAK4BBlqdNC0iNVRBsfi7+esMnnpnCl83acU/\nOt0FgCuvgLTF6yu9LkfU5WKMuRIYBlxkrc0JbpVERMJHtDHuULeWYZ9N587l83jvpPO5v8tD5MbE\nesttzXZVel3KDHRjzGzgYqCRMWYLMAr3rJZawEfGGIDl1trbK7GeIiLVTkamk5hoAwfyefyDZ+m7\n5iNmnnkVqZffTmFUtF/ZxglxlV6f8sxy6R/g6VcroS4iImEhI9PJ6IVryXblUSsvlxcWTubyn1cw\ntUN/pnYYAO6Grp+hnU6u9HqFdLdFEZFwl5Hp5JEFa3DlFVB//15enj+Os7f8wIjL72Bmu84Br0mM\nd4RklosCXUTkMKQtXo8rr4Cj/9rJ9LmjaLFzC3d3Hca7p14QsHycI5pRXVqFpG4KdBGRcsjIdJK2\neD3ObBcn/OlkRnoqia493Nh7NMuan1nqdRN6tAlJ6xwU6CIiZfLtZmm97WdenzsKgP79HmfNcS1L\nvS45IS5kYQ7abVFEpEyebpYOv63izdmPsD+mFr0HTj5kmMc5okMyEOpLLXQRkTJszXbRed3nPPnO\nFDY0bMLg3mPYXq8h4G6Fe4I7bfF6tma7aFz0XChb56BAFxEp09/XfcgDC59hZZNTuaVnKntq1wXc\nYb5s+CXecqEO8OIU6CIipbEWxozhoYVPs+Sk9tx5zVByHbWAqulSKYsCXUSkiO9MFoctZNSH/2LQ\nqvfZ2K0vex+dRKMlG6q0S6UsCnQREfxnssTm5/HkO0/Qef0ynm/fi2daD2ZCTIxf90p1pFkuIiIc\nnMlSNzeH1+eOovP6ZYy75BYmX3QDrvzCkOyWWFFqoYuI4J7J0mjfLl6fO5qTs37jvmseJKNVR7/X\nqzsFuogIkFK4i7SZwzh635/c2mMkn7RI8Xs9FLslVpQCXURk9WremPYgrv0uBvYdT2byKX4vG0Kz\nW2JFqQ9dRGq2Tz+FCy+kdu1YZkx+g1UBwnxg+2bVbkZLIGqhi0hE80xFDDjd8K23oH9/OPFEWLyY\ne5s25fhDla/mTCiPA01JSbErV64M2eeJSM3mOxXRwxFlqFs7hk5fvsP4xc+xu/WZHLX0Q2jYsApr\nemjGmG+stSlllVOXi4hELM9URF95BYX0XzKLiR88w2cntOWyq0aSsWl/FdUwuNTlIiIRxbeLpXj/\ng7GFpC55mRu/WcSCVh0ZdtW95JsY0h
avD5tulUNRoItIxAjUxeLhKMjjiXen0m3dp7yS0o3xl9yM\nNe5OCmcYzDEvDwW6iESMQF0sAPEHXLzw1uNc+FsmEy+6gRfO7el3kHN0gEOdw5ECXUQiRqDVnIk5\nu5k2bwxttv3M0KvuYe7pV5QoUxDCySGVSYEuImHP029ePJaTd29nRnoqyXu2c9u1j/KflucGvD45\nDFaBlocCXUTCWmn95i2zNjIjPZU6efu5rs9Yvm7aOuD1jmgTFqtAy0OBLiJhLVC/ebst63ht/hhy\nY2LpPXAS65OaB7w2Md7BqC6tImKGCyjQRSTMFe8377jha57PmMjv9Rpyfd9xbGlwjN/r0cYwpc8Z\nERPivrSwSETCmu8uiD2+X8LL88fxU6OmDL5xCjsbNfYrG+eIjtgwB7XQRSSMBNqXpXnDOJzZLm5d\nsYBHP3mNz48/k9uv/QcuRzwDzkpm6Y9ZYbkvy5HQXi4iEhYCDX5GGSgstAz/ZBq3f7WARadcwIOd\nH+BAjANwTzV/ss+ZYR/i5d3LRS10EQkLgQY/o/LzmfzBM/T6fgnT23VmzKVDKIyK9r5uLTyyYA1A\n2Id6eagPXUTCQvHl+bXz9vPiW+Pp9f0S/vm3gYy67Ha/MPdw5RWExXmgwaAWuohUexmZTgx4Fw41\ncP3Fq/PH0s75I49ecSez2l59yOvD4TzQYFCgi0i157sK9Ng9O5g+N5Xmu7ZyZ/fhfHByhzKvD4fz\nQINBgS4i1U7x2Sye7pYWOzczPT2VBvv3ckPvsXx5/OllvlecIzpiVoKWRYEuItVK8dkszmwXBjh9\n63qmzRtDgYmi34CJrD2mhfcaA/w6sbP3+nA9Qq6iFOgiUq0Ems3yt1+/5YW3HmdHnQSu7zOWjYn+\nC4Z8u1S6t02uMQFeXJmzXIwxrxljthtjvvd57ihjzEfGmJ+Kfk+s3GqKSE1RfACz6w+f8uq8sWxM\nPI6bbn6yRJjXpC6VspRn2uLrwJXFnhsOLLHWtgSWFH0tInLYMjKddJj4MScMf5cOEz+mQZzD+9oN\nKxfy9KI0vk0+hfvumMqSJ/oxte+ZJCfEYXBvezuhR5sa2yIvrswuF2vtZ8aY5sWe7gZcXPR4OvAJ\n8HAQ6yUiNUCg/nKAKCz3fzaTu7+cw+KW7RnW8xHGdDsLqNldKmU50j70Y6y1vxc93gYcc6jCIiKB\nBOovjy4sYNyHzzNg9WJmn34F/+rzEGOuPk0hXg4VHhS11lpjTKkbwhhjhgBDAJo1a1bRjxORCFJ8\n9Wet/AM8vXAynX5azjPn9eXpi68jTWFebke69P8PY8xxAEW/by+toLX2JWttirU2JSkp6Qg/TkQi\nke/hzPVy9zE9PZVOPy1n1GW3MeXC68grpMYs2w+GIw30hcDgoseDgbeDUx0RqUk8hzMn7f2TOf8e\nTjvnj9zTZSjTz+riLVNTlu0HQ5ldLsaY2bgHQBsZY7YAo4CJQLox5mZgI9CnMispIuEv0IKf5IQ4\nYn7dwBtzRtIwZzc390rl8xPa+V1XU5btB0N5Zrn0L+WlS4NcFxGJUIFms9w/ZxWdDmxl3MxhRNtC\nBvQbz+rGJeeTa455+WmlqIhUutEL15aYzdJ+43ekLRjH7tp1ub7POH5p2KTEdYPaN9OA6GFQoItI\npcrIdJLtyvN77sr1y3hqURq/JTbm+j5j+aNeoxLXTe0b/icNhZoOuBCRSlV8lsqAVe/zfMZE1hzb\nkj4DJgUM8+SEOIX5EVALXUQqlXeWirXc88WbPPDfWSxpcTZ/7/Yw+x21S5TX3ixHToEuIkGVkelk\nzKK17Mo52M0SVVjAqCUvMfjbd5nX+lKGX3k3+dEH4yc5Ia5GbncbbAp0EQmajEwnQ+etJq/g4OLx\n2Pw8/vnuP7nmx8954ZweTLz4RvBZUJScEMey4ZdURXUjjgJdRIImbfF6vzCvk5vDi2+N528bVzP+\n4p
t4+dwefuXVvRJcCnQRCRrfVZ0N92Uzbd5oTvvjFx7ofD8LWvsvXUlW90rQKdBF5IgVX/0ZHxvN\nvgMFNMnexoz0VI77aye39hzJ0hZnl7hW3SzBp0AXkSNS2l7mJ2f9xoz0VGrlH2Bg38f4tsmpJa5N\n8DnEQoJH89BF5IgEWv2ZsmUt6bMephBD7wGTAoa5I8owumurUFWzRlELXUQOS0amk9EL15ZY/XnZ\nTyt4duEknPWP5rq+Y9la/+gS16rfvHIp0EWk3Ip3s3j0/u4jJnzwDN8f24Ibe41mV3yDEtf+NrFz\nqKpZYynQRaTcShwZZy23r5jP8E9f57Pmbbn92n+QE1tyu9tkbYEbEgp0ESk332mJxhby6MevcsvK\nt3n71It4qPN9xMbVJs7iF/qaax46CnQRKbfGCXE4s13EFOQz+f2n6LF2KdPO6sLYS2+ldqyD8de2\nAShxkIX6zENDgS4iZRr48pcs2/AnAHEH9vOvjAlc/Os3TL7wep5v35vEOrGM6tLKG9wK8KqhQBeR\nQ/IN8wTXHqbNHcPp237i4Svv5r8XdWeqWuDVhgJdRPx4Vn86s11EG+M9yPm4PVnMSE+lWfY27uj+\nCB+edB4m2+Xd71yhXvUU6CLiVXxaoifMW+zYzBvpI6mbm8P1fcayopm7r9ziXiH6yII1gEK9qmml\nqIh4lZiWCLR1/si8WcNwFObTd+BEb5j7cuUVlDiZSEJPgS4iXr7TEgEu3rCSWXMeZXftuvQY9ATr\njj6x3NdK6CnQRcSrsc8CoG5rl/LygnH8clQTeg2azOaEY8t9rVQNBbqIAO7+85wD+QDc/HUGT70z\nha+btKJf/wnsqJPoLZcQ5yDOEe13rRYPVQ8aFBWRg4OhB/J5+NPp3LFiHu+ddD73d3mI3JhYb7k4\nR7R3p0QtHqp+FOgiQtri9RzIPcCkD56l75qPmHnmVaRefjv169SmUa2YgMGtAK9+FOgiNZhnzvmO\nrGxeWDiZy39ewdQO/ZnaYQAYw25XHqtGXVHV1ZRyUqCL1EC+e5rX37+XN+aPJWXLOkZcfgcz2x3c\n5jYhXicLhRMFukgNkpHpZMyitezKcR9OcfRfO5k+dxQtdm7h7q7DePfUC/zKF60rkjChQBepIYqv\nAj3hTycz0lNJdO3hxt6jWdb8zBLX7C52KpFUbwp0kRrCdxVo620/8/rcUQD07/c4a45rGfAazS0P\nL5qHLlJDOItWcnb4bRVvzn6E/TG16D1wcqlhrrnl4UctdJEaICPTiQGuXvc5T74zhQ0NmzC49xi2\n12sYsHxivMNvf3MJDwp0kRogbfF6Bn37DmM+epGVTU7llp6p7Kldt0S5ZC0SCmsKdJFIZy19Fr3C\nvV/M5qP/O5e7ug4j11GrRLHfJnYOcLGEEwW6SCQrKIC//517v5jNnDaX848r76IgKrpEsWQNfkYE\nBbpIhPGu/tyxmxcWP0nH7z/n+fa9mHzhYDCmRHkNfkaOCgW6MeZ+4BbcB5esAW601u4PRsVE5NA8\nwe3ZZ6XjKUm8s/p3sl151M3NYdqCxzh/03dMuHwIczr0hABzyqONYUKPNuozjxBHPG3RGJMM3AOk\nWGtbA9FAv2BVTERK51kk5Mx2eY+Bm7l8E9muPBrt28Wbsx/h7C1rufeaB3mxXVeMIeCWt1P6nKEw\njyAVnYceA8QZY2KAeGBrxaskImUJdFQcQNPsbcybOYwT/9zCrT1G8narjgBk5+QxoUcbkhPiMLj7\nzNUyjzxH3OVirXUaY54ANgEu4ENr7YdBq5mIlCrQcW+nbv+F6emjcBTkM7DveDKTT/G+1jghju5t\nkxXgEa4iXS6JQDfgBKAxUMcYMyhAuSHGmJXGmJVZWVlHXlMR8Sq+JP/cTWuYM2s4+VHR9Bo42S/M\nNehZc1Sky+Uy4FdrbZa1Ng9YAJxfvJC19iVrbYq1NiUpKakCHyci
Hs0bHgz0Tv/7ghnpqfxRryE9\nB6WxoVFT72uJ8Q51rdQgFZnlsglob4yJx93lcimwMii1EpGAMjKdPPrWGvYdcPef9129mMcXP8fq\n41pyU69RZMfV95ZNjHeQmarDKWqSivShrzDGzAO+BfKBTOClYFVMRPz5bX9rLX//Mp2hn7/B0hPP\n4s5uj+CKre0tG+eIZlSXVlVYW6kKFZqHbq0dBYwKUl1Earzic8t991XxzGwxtpDUJS9z4zeLWNCq\nI8Ouupf86IN/lTW3vOYyNoRHkqSkpNiVK9UrIxJI8QMoABxRhrq1Y7wnDDkK8pjy7pN0XfcZL5/d\nncc73oQ1B4fC4hzRCvMIZIz5xlqbUlY5Lf0XqSYCzS3PK7TeMI8/4OKFtx7nwt8ymXDxDbx4Tk+/\npfxxjiiFeQ2nQBepJgLNLfdIzNnNtHljaLPtZ4ZedQ9zTz842GmAge2b8Vj3NiGopVRnCnSRaqJx\nQpz3VCFfybu3MyM9leQ927nt2kf5T8tzva9N7XumWuTipSPoRKqJoZ1OLrHfSsusjcybOZSkfbu4\nrs9YvzBPLlr9KeKhFrpINeE7m8WZ7aLdlnW8Nn8MuTGx9B44ifVJzb1ltfpTAlELXaQa6d42mWXD\nL6HXttXMmjOCP+Pq03NQml+Ya2MtKY1a6CLVzfTpTJ6Zyg9Hn8jgnqPYWScB0JREKZsCXaQKFV9I\nNHLdO1w540m+OP5Mhg0YRWHteExOXolFRiKBKNBFqojvQiJjC7n+ree48qsFLDrlAh7s/AAHCh3E\n5RXypGaySDmpD12kCmRkOnkwfTWuvAJiCvJJe+8pbvtqAdPbdebeLg9xIMYBgCuvgLTF66u4thIu\n1EIXCbERGWuYtXwTFqidt5/n3p7EpRu+5p9/G8jT5/crcZDzoRYcifhSoIuEUEamk5nLNwHQwPUX\nr84fSzvnjzx6xZ3Mant1wGuKH2YhUhoFukgIeAY/PStBj92zg+lzU2m+ayt3dh/OByd3CHid5pvL\n4VCgi1QS3xA3gGdf0xY7NzM9PZUG+/dyQ++xfHn86X7Xecoma2aLHCYFukglKL4VrifMz9i6nmnz\nxlBgoug3YCJrj2nhd51CXCpCgS5SCQJthXvBr9/ywluPs6NOAtf3GcvGxMbe1+rERrN27JWhrqZE\nGAW6SCUoPjOl6w+fMuXdf/JTo2YM7j2WrLqJ3tcc0Ybx12rrW6k4zUMXqQQJ8Q7v4xtWLuTpRWl8\nk3wqfQdM9Avz5IQ40nqdoS4WCQq10EUqgbXu/zz0+Rvc9WU6i1u2556uw8iNifWWiTaGZcMvqbpK\nSsRRoIsEWUamk7/27WfC4ufo/92HzD79CkZ0+jsFUf57nReE8DxfqRkU6CIVUHxzrY6nJLFoxa88\nnzGBTj8t55nz+jLlgkElVn+Cu7tFJJgU6CJHqPjURGe2i4WfruOl+eNov/l7Rl12G9PP6hLwWkeU\n0YIhCToFusgRKj41MWnvn0yfO4r/27GZe7oMZeFpFwW8LiHOweiurTQQKkGnQBc5Qr5TE4/ftZU3\n5oykYc5ubu6VyucntPMrm5wQpwFQqXSatihyhDybZrXa9jPzZg6j7gEXA/qN57/Fwlz7sUioKNBF\nDlNGppMOEz/Gme3i/I2reXP2I+TGOOg1cDL/O/40BrZvRnJCHAad/ymhpS4XkcPgOxB61Y//Zeo7\nT/BbYmOu7zOWmKZNmaB9WKQKKdBFDoNnIHRg5nuM+/BffJt8Cjf3TGVPXD2eVJhLFVOXi0g5eLtZ\nduVwz7LZjP/weZa2SGFQ33HsjquHBR0VJ1VOLXSRMni6WXJzDzBmyUsM/vZd5rW+lOFX3k1+9MG/\nQjoqTqqaAl2kDKMXrqXAtZ+n3/0n1/z4OS+c04OJF99YYvWnjoqTqqZAFylFRqaT0QvXkpe9m9fe\nGs/fNq5m/MU38fK5PUqU1dRE
qQ4U6CIBjMhYw6zlmzhqXzYz5o3mtD9+4YHO97Og9aUlykYbo6mJ\nUi0o0EV8ZGQ6GbNoLbty8miSvY0Z6akc99dObu05kqUtzi5RPs4RrTCXakOBLjWab4D7OmX7r0yf\nO4pa+QcY2Pcxvm1yaolrdf6nVDcKdKmxMjKdDJ23mrwC/33Jz978Pa/OH8c+R216D5jET0nH+72u\nVrlUVxUKdGNMAvAK0Br3weY3WWu/DEbFRCqD7/7lUcaUOGTisp9W8OzCSTjrH811fceytf7Rfq/X\niY1m/LUKc6meKtpCfwr4wFrbyxgTC8QHoU4ilaL4/uXFw7z3dx8y8YNnWXNsC27sNZpd8Q1KvEdC\nfKzCXKqtIw50Y0wD4ELgBgBr7QHgQHCqJRJ8xfcv97KW21fMZ/inr/NZ87bcfu0/yIkNPKdci4ek\nOqtIC/0EIAuYZow5A/gGuNdauy8oNRMJskBhbGwhj378KresfJu3T72IhzrfR160o9T30OIhqc4q\nEugxQDvgbmvtCmPMU8BwYKRvIWPMEGAIQLNmzSrwcSLl5+krd2a7iC7qK48u1mceU5DP5Pefosfa\npUw7qwtjL70Va0rf3kiLh6S6q8jmXFuALdbaFUVfz8Md8H6stS9Za1OstSlJSUkV+DiR8vH0lTuL\nWuSeEPcN87gD+3ll/jh6rF3KlIuuZ8ylQ0qEuSPakBDn0L7mEjaOuIVurd1mjNlsjDnZWrseuBT4\nIXhVEzkypfaVF2m4/y9eSR/N6dt+InPkZFpcO4DkYq15zTGXcFTRWS53A7OKZrj8AtxY8SqJVMyh\nBi6P25PFjPRUWu7dDgvm07Z7d9qCglsiQoUC3Vq7CkgJUl1EgqJxQpy3u8XX/+3YxIz0VOodyOHz\nZ2dyQffuVVA7kcqjAy4k4gztdDJxjmi/59o51zF31sPEFBbQZ8BEhmysS0ams4pqKFI5FOgScbq3\nTWZCjzYkF00xvHjDSma9OYLdtevSc1Aa644+EVdegU4YkoijQJeIdu3apby8YBwbGjah16DJbE44\n1vuaFglJpNHmXBJxPNMWB3wxn5Efv8IXzU5nSI8R7K3lvzOFFglJpFGgS0QYkbGG2Ss2u+eaW8vD\nn07njhXzeO+k87m/y0PkxsT6ldciIYlECnQJeyMy1jBz+SYAogsLePyDZ+m75iNmnnkVqZffTmHU\nwQFSg7tlrjnmEokU6BL2Zq/YDECtvFyeXTiZy39ewdQO/ZnaYYDfQc7JCXEsG35JVVVTpNIp0CXs\nFVhL/f17eWX+WFK2rGPE5Xcws11nvzLqYpGaQIEuYe+4vX8ybc5ITvzTyd1dh/HuqRd4X1MXi9Qk\nCnQJWxmZTt6c+R/mvjGMhP1/cUPv0XzR/Ezv64PaN+Ox7m2qsIYioaVAl7CUkelk5rPzeXH2SKwx\n9Os/ge+P/T8Aoo2h/7lNFeZS4yjQJSwtfe7fvD5zFNm163Fd33H8epS7O0UDn1KTaaWohJ/0dNKm\n/YPNDY6hx6A0b5iDVn9KzaZAl/Dy3HPQrx/rmp1C3wET2V6vod/LWv0pNZkCXcKDtfw45H646y4+\nanEOQwY+jqtOPb8impooNZ360KXa8pwLuu3PvUxa+iK9Vr7HnDaX848r76IgPwpHFCTGO8jOydPU\nRBEU6FJNeMJ7a7aLxglxdDwliTlfbyYqN5dnFj3B1f/7gufb92LyhYO9qz/zCi3xsTFkpl5RxbUX\nqR4U6FLlPLsjes4BdWa7mLl8E3Vzc3h5wTjO27SGsZfcymtndytxrQZBRQ5SoEvIFW+N78vNL3Go\nc6N9u3h97mhOzvqNe695kLdbdQz4XhoEFTlIgS4hFag1XlzT7G28MWckR+/7k1t6pvLpiWcFfC8N\ngor4U6BLSKUtXl+iNe7rtD9+YfrcVGIKChjYdzyZyaeUKKP9WUQCU6BLSB2qz/vcTWt4ef449t
aK\np1+/CWxo1LREmYQ4B6tGaRBUJBDNQ5eQKq3Pu9P/vmBGeip/1GtIz0FpAcPcEWUY3bVVZVdRJGwp\n0CWkOp6SVOK5vqsX83zGRNYecyK9B07i9/olyyTEOUjrfYa6WEQOQV0uElJLf8w6+IW1/P3LdIZ+\n/gZLTzyLO7s9giu2NgBRBqxVX7nI4VCgS0h5+tCNLSR1ycvc+M0iFrTqyLCr7iU/+uAfx/q11Vcu\ncrgU6BJSjRPi2L5zD1PefZKu6z7j5bO783jHm7DGv/dvtyuvimooEr4U6BJSwy9oQuL1/fnbL98y\n4eIbePGcnn4HOXtowZDI4dOgqIRERqaTziMW0Kz3NZz36yoeveY+Xjy3V8Aw14IhkSOjFrpUuhEZ\na/hk8Uqmp48keU8Wt137KP9pea5fGQNY3CcOaRBU5Mgo0CXofPdqaRDnIGnTz8xLH0l8Xi7X9RnL\n101bl7jGE+Y6Pk7kyCnQJagyMp0MnbuavEILwIk/fcdr88eQGxNL74GTWJ/UvNRrtXOiSMUo0CUo\nPK1y3822Om74muczJvJ7vYZc33ccWxocc8j30ECoSMUo0KXCRmSsYdbyTVif53quWcKk95/ih2NO\n5MZeo9lZJ+GQ76GBUJGK0ywXqZCMTGeJMB+yYj5T3nuSL5udTv9+j5cIcwN0aHEUyQlxGNx95xN6\ntNFAqEgFqYUuFZK2eL03zI0tZPgnr3PbVwtYdMoFPNj5AQ7EOAAt5RcJBQW6VIhnIDOmIJ9JHzxN\nz+8/5vV21zDmsiHe1Z+OaENaL22sJVLZKhzoxphoYCXgtNZeU/EqSThpnBDHzqxdPPf2JC7d8DVT\n/jaQZ87v510wlBjvYFSXVgpzkRAIRgv9XmAdUD8I71WjZWQ6Gb1wLdlF+5hURhgWP8+zIt0fGZlO\nCnfsYNabo2i7dT2PXnEns9pe7VcmPjZGYS4SIhUaFDXGNAE6A68Epzo1l2f+drbPplS7cvK4b84q\nRmSsCdpnPLJgDc5sFxb3eZ6PLFhDRqbziN5r6usfM/31obT+42fu7D68RJiD5paLhFJFZ7lMBYYB\nhaUVMMYMMcasNMaszMrKKq1YjZe2eL13MU5xs5ZvOqLQDfQZxc/zdOUVkLZ4/WG/1/TX3mfWtAc5\n7q8sbug9lg9O7hCwnOaWi4TOEQe6MeYaYLu19ptDlbPWvmStTbHWpiQllTyJRtwO1ZK1cEShW97P\ncGa7OGH4u3SY+HG5/uH45I13ePXVB4ktyKPfgIl8efzpActpbrlIaFWkD70D0NUYczVQG6hvjJlp\nrR0UnKr643+gAAAJrklEQVTVDJ4+7cBt84NKC+Pi+6YYA9k5eX6PPX3ljRPi/FZy+vLtggFK7/f+\n8EPOuaU3WfEJXN9nLBsTGwcspsFQkdAz1pYVJeV4E2MuBh4qa5ZLSkqKXblyZYU/r7or78Cjp0+7\neDdIIAlxDurUivF7T8Bv35RD8exmWB6eTbIyMp2MWbSWXTnufv1+P/+Xx99+gh+Pasrg3mPJqpsY\n8Pqpfc9UkIsEkTHmG2ttSlnlNA89yIqH9KFavYH6tANxRBn2Hcj3Dpg6s13cN2fVIUPa2EISXX+R\ntG8XjfZl06jo96Sc7KKvs2mU436+Xm4O6adfzuQLB+OKrc3WbJd7kHbeavIK3J9ww8qFjF7yEiua\nteGhgWPIKowN+LnJCXEKc5EqEpQWenlVpxZ6MKfv+eow8eNSuzUSinWDlFbOV7wjilqOaHbl5BFV\nWMBRrj3eQPaGdVEwJ3kfZ3NUzm5ibMmx6gNRMeyok+D+Fe/+vVZ+Ht3WfcqmBsfw8FX38uOpZ7Hb\nlUehBazloc/f4K4v0/ngpPO4t8tQ4urVYd+BfG/YeziiDGm9tYBIJNjK20KvkYEeaDMp8O/3LSvw\nfXcXjDaGAmtJLmdI+4ouLOConN3uMN63yxvIJQM7m6Nce4
gOENK50Q6y/EI6kR11Esiqk+gNbc/X\ne2rVCXhK0Dmbv2fS+09xwq7fmdX2KiZcdCMuRy0eW/wc/b/7kH+f0YkRV9xJYVQ0Bniy75l+3TEJ\ncQ5Gd1WfuUhliMhA9w3RKAOeruOywsQ3nBPiHd4QCiTOEU3Ps5KZ/43TrzskzhHt3UCqtH8QwN1X\nHV2QT8McTyhn+4T0wcD2hHWi6y+iAryTK6aWXys6y+fxjjqJRQHuDu6/YuMDhvThqp23n4eW/Zsb\nv8pgW92GbGjYhAt/y+SZ8/oy5YJB3s/QQRQioRVxgV6eAcTEeIffrA5PS7u8A48enha3L0dBHq1i\ncrmiEXy1fB1Jni4Ov8B2Pz7KtSfg++Y4apVoNe+ITzzYui4K7aw6ieyLjfML6ThH9GHdQ0WcuXU9\nae9NpeXOzYy67Damn9XF+5q6VURCL+IC/VB904F4WtTFD13wFZufR6OcXQfDuCiQD4b1wS6PhP17\nA77H3ti4otZzYom+6R11EsjyeT4n9sgW2XjO2fTdFiCQxHj3zoaH+gkE8PvppjSx+Xkc91eW37RE\nT1eLwlwktCJulkt5l5DXysv1zuR4b/wXXLC3WL+0zwBi/dx9Ad9jT2y8N4TXJx3Psjpn+PVNu1vR\n7tf3O2oH8zYD8vy04fmJ48H01SV+ggD3vilDO51c4icSR7ShTmwMu10Hf3op7R+6xHgHe3PzOYDD\nL8zVMhep/sIm0D2zQi75+SuO3bvTp1XtH9j1D+QEvH53rTreQF6XdAKfNy8aNCw+mBjfgFxHrRDf\nXekGtW/mF6Ld2yZz/5xVActuzXZ5y5ZnBk/x4I9zRDOqSysADXiKhKGwCXRPy3P84uc4bu9OAHbV\nrucN5LXHtPCb2ZFV92Bf9c74BO9BC5UpUN97cWUt8Ik2hkJrDxnEpU159Oyb4mnNH0pZwa/wFgk/\nYRPo3dsms3Ljnwzu9xi7Y+P4M74BedGVH9LFFQ9kAwxs34zHurcpcwDWM4Nm6Y9ZOLNdJd7LdybN\noQTqVjmSfVPKE/wiEj7CJtAzMp3uqYQNm4bk8xzRBix+y+p9A/lQrdpAe6scai774S5uOpxuFRGp\nOcJ+lkugPU7uK6WPGdxBXXyFo+d9iocvKDRFpOrVmFkuu115rBp1hd9zpc3gSPaZ4VHekFaAi0i4\nCJtAL2sg0Neh+pjVbywikaqiJxaFzNBOJxPniPZ7rrSBwO5tk5nQow3JCXEY3C3z8gw2ioiEs7Bp\noR/uQKBa4iJS04RNoINCWkTkUMKmy0VERA5NgS4iEiEU6CIiEUKBLiISIRToIiIRIqRL/40xWcDG\nCr5NI2BHEKoTLnS/kU33G9mCdb/HW2uTyioU0kAPBmPMyvLsaRApdL+RTfcb2UJ9v+pyERGJEAp0\nEZEIEY6B/lJVVyDEdL+RTfcb2UJ6v2HXhy4iIoGFYwtdREQCqLaBboy50hiz3hjzszFmeIDXaxlj\n5hS9vsIY0zz0tQyectzvA8aYH4wx3xljlhhjjq+KegZLWffrU66nMcYaY8J6ZkR57tcY06foe7zW\nGPPvUNcxmMrx57mZMWapMSaz6M/01VVRz2AwxrxmjNlujPm+lNeNMebpov8X3xlj2lVaZay11e4X\nEA1sAE4EYoHVwGnFytwJvFD0uB8wp6rrXcn32xGIL3p8R6Tfb1G5esBnwHIgparrXcnf35ZAJpBY\n9PXRVV3vSr7fl4A7ih6fBvxW1fWuwP1eCLQDvi/l9auB93GfKd8eWFFZdamuLfRzgJ+ttb9Yaw8A\nbwLdipXpBkwvejwPuNQYY0JYx2Aq836ttUuttTlFXy4HmoS4jsFUnu8vwDhgErA/lJWrBOW531uB\n56y1uwCstdtDXMdgKs/9WqB+0eMGwNYQ1i+orLWfAX8eokg3YIZ1Ww4kGGOOq4y6VNdATwY2+3y9\npei5gGWstfnAbqBhSG
oXfOW5X1834/4XP1yVeb9FP5Y2tda+G8qKVZLyfH9PAk4yxiwzxiw3xlwZ\nstoFX3nudzQwyBizBXgPuDs0VasSh/v3+4iF1QEXAsaYQUAKcFFV16WyGGOigH8CN1RxVUIpBne3\ny8W4f/r6zBjTxlqbXaW1qjz9gdettVOMMecBbxhjWltrC6u6YuGsurbQnUBTn6+bFD0XsIwxJgb3\nj207Q1K74CvP/WKMuQx4FOhqrc0NUd0qQ1n3Ww9oDXxijPkNd7/jwjAeGC3P93cLsNBam2et/RX4\nH+6AD0flud+bgXQAa+2XQG3c+55EonL9/Q6G6hroXwMtjTEnGGNicQ96LixWZiEwuOhxL+BjWzQC\nEYbKvF9jTFvgRdxhHs79q1DG/Vprd1trG1lrm1trm+MeM+hqrV1ZNdWtsPL8ec7A3TrHGNMIdxfM\nL6GsZBCV5343AZcCGGNOxR3oWSGtZegsBK4vmu3SHthtrf29Uj6pqkeIDzFyfDXuVsoG4NGi58bi\n/osN7j8Ac4Gfga+AE6u6zpV8v/8B/gBWFf1aWNV1rsz7LVb2E8J4lks5v78GdzfTD8AaoF9V17mS\n7/c0YBnuGTCrgCuqus4VuNfZwO9AHu6ftG4Gbgdu9/nePlf0/2JNZf5Z1kpREZEIUV27XERE5DAp\n0EVEIoQCXUQkQijQRUQihAJdRCRCKNBFRCKEAl1EJEIo0EVEIsT/A9s/vksORoTUAAAAAElFTkSu\nQmCC\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x10a4d8668>"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fig"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: classification_and_regression_trees/prune.py
================================================
#!/usr/bin/env python
# -*- coding: utf-8 -*-

from regression_tree import *

def not_tree(tree):
    ''' Check whether the given object is a leaf value rather than a tree (dict).
    '''
    return type(tree) is not dict

def collapse(tree):
    ''' Collapse a subtree into a single value by recursively averaging
    its left and right branches.
    '''
    if not_tree(tree):
        return tree
    ltree, rtree = tree['left'], tree['right']
    return (collapse(ltree) + collapse(rtree))/2

def postprune(tree, test_data):
    ''' Post-prune the tree structure using a held-out test set.
    '''
    if not_tree(tree):
        return tree

    # With no test data left, collapse the subtree to its mean leaf value
    if not test_data:
        return collapse(tree)

    # Partition the test data with the current split and prune each subtree
    # against its matching partition
    ldata, rdata = split_dataset(test_data, tree['feat_idx'], tree['feat_val'])
    tree['left'] = postprune(tree['left'], ldata)
    tree['right'] = postprune(tree['right'], rdata)
    ltree, rtree = tree['left'], tree['right']

    if not_tree(ltree) and not_tree(rtree):
        # Compare the squared error on the target column before and
        # after merging the two leaves
        err_no_merge = (sum((row[-1] - ltree)**2 for row in ldata) +
                        sum((row[-1] - rtree)**2 for row in rdata))
        err_merge = sum((row[-1] - (ltree + rtree)/2)**2 for row in test_data)

        if err_merge < err_no_merge:
            print('merged')
            return (ltree + rtree)/2

    return tree
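The merge test at the heart of `postprune` can be seen in isolation on a hand-built stump. This is a minimal, self-contained sketch; the stump, its leaf values, the split threshold, and the test rows below are all made up for illustration. It compares the squared error on the target column when the split is kept against the error after collapsing both leaves into their average:

```python
# Minimal sketch of the postprune merge test on a hand-built stump.
# The stump, its leaf values and the test rows are illustrative only.
tree = {'feat_idx': 0, 'feat_val': 0.5, 'left': 1.0, 'right': 3.0}
test_data = [[0.2, 1.1], [0.3, 0.9], [0.7, 3.2], [0.8, 2.9]]

# Partition the test rows with the stump's split, as split_dataset would
ldata = [row for row in test_data if row[0] < tree['feat_val']]
rdata = [row for row in test_data if row[0] >= tree['feat_val']]

# Squared error on the target column if the split is kept...
err_no_merge = (sum((row[-1] - tree['left'])**2 for row in ldata) +
                sum((row[-1] - tree['right'])**2 for row in rdata))
# ...versus collapsing both leaves into their average
merged = (tree['left'] + tree['right']) / 2
err_merge = sum((row[-1] - merged)**2 for row in test_data)

# For these rows the split fits better, so it would not be merged
print(err_no_merge, err_merge, err_merge < err_no_merge)
```

On this toy data the split error is about 0.07 versus about 4.27 after merging, so the branch survives; pruning only fires when the merged leaf generalizes better to the test rows.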



================================================
FILE: classification_and_regression_trees/regression_tree.py
================================================
#!/usr/bin/env python
# -*- coding: utf-8 -*-

''' Regression tree implementation.
'''

import uuid
from functools import namedtuple

import numpy as np
import matplotlib.pyplot as plt


def load_data(filename):
    ''' Load whitespace-separated numeric data from a text file.
    '''
    dataset = []
    with open(filename, 'r') as f:
        for line in f:
            line_data = [float(data) for data in line.split()]
            dataset.append(line_data)
    return dataset

def split_dataset(dataset, feat_idx, value):
    ''' Split the dataset on the given feature index and feature value.
    '''
    ldata, rdata = [], []
    for data in dataset:
        if data[feat_idx] < value:
            ldata.append(data)
        else:
            rdata.append(data)
    return ldata, rdata

def create_tree(dataset, fleaf, ferr, opt=None):
    ''' Recursively build the tree structure.

    dataset: dataset to be partitioned
    fleaf: function that creates a leaf node
    ferr: function that computes the dataset error
    opt: regression tree parameters.
        err_tolerance: minimum error reduction for a split;
        n_tolerance: minimum number of samples in a split
    '''
    if opt is None:
        opt = {'err_tolerance': 1, 'n_tolerance': 4}

    # Choose the best feature and feature value to split on
    feat_idx, value = choose_best_feature(dataset, fleaf, ferr, opt)

    # Base case: no worthwhile split was found, return a leaf value
    if feat_idx is None:
        return value

    # Create the regression tree node
    tree = {'feat_idx': feat_idx, 'feat_val': value}

    # Recursively create the left and right subtrees
    ldata, rdata = split_dataset(dataset, feat_idx, value)
    ltree = create_tree(ldata, fleaf, ferr, opt)
    rtree = create_tree(rdata, fleaf, ferr, opt)
    tree['left'] = ltree
    tree['right'] = rtree

    return tree

def fleaf(dataset):
    ''' Compute the leaf value for the given data: here, the mean of the targets.
    '''
    dataset = np.array(dataset)
    return np.mean(dataset[:, -1])

def ferr(dataset):
    ''' Compute the total squared error of the dataset: the variance of the
    target column times the number of samples.
    '''
    dataset = np.array(dataset)
    return np.var(dataset[:, -1])*dataset.shape[0]

def choose_best_feature(dataset, fleaf, ferr, opt):
    ''' Choose the best feature and feature value to split on.

    dataset: dataset to be partitioned
    fleaf: function that creates a leaf node
    ferr: function that computes the dataset error
    opt: regression tree parameters.
        err_tolerance: minimum error reduction for a split;
        n_tolerance: minimum number of samples in a split
    '''
    dataset = np.array(dataset)
    _, n = dataset.shape
    err_tolerance, n_tolerance = opt['err_tolerance'], opt['n_tolerance']

    err = ferr(dataset)
    best_feat_idx, best_feat_val, best_err = 0, 0, float('inf')

    # Iterate over all features
    for feat_idx in range(n-1):
        values = dataset[:, feat_idx]
        # Iterate over all feature values
        for val in values:
            # Split the data on the current feature and value
            ldata, rdata = split_dataset(dataset.tolist(), feat_idx, val)
            if len(ldata) < n_tolerance or len(rdata) < n_tolerance:
                # Skip splits that leave too few samples on either side
                continue

            # Compute the post-split error
            new_err = ferr(ldata) + ferr(rdata)
            if new_err < best_err:
                best_feat_idx = feat_idx
                best_feat_val = val
                best_err = new_err

    # If the error reduction is too small, merge everything into one leaf
    if abs(err - best_err) < err_tolerance:
        return None, fleaf(dataset)

    # Check whether the best split leaves too few samples on either side
    ldata, rdata = split_dataset(dataset.tolist(), best_feat_idx, best_feat_val)
    if len(ldata) < n_tolerance or len(rdata) < n_tolerance:
        return None, fleaf(dataset)

    return best_feat_idx, best_feat_val

def get_nodes_edges(tree, root_node=None):
    ''' Return all nodes and edges in the tree.
    '''
    Node = namedtuple('Node', ['id', 'label'])
    Edge = namedtuple('Edge', ['start', 'end'])

    nodes, edges = [], []

    if type(tree) is not dict:
        return nodes, edges

    if root_node is None:
        label = '{}: {}'.format(tree['feat_idx'], tree['feat_val'])
        root_node = Node._make([uuid.uuid4(), label])
        nodes.append(root_node)

    for sub_tree in (tree['left'], tree['right']):
        if type(sub_tree) is dict:
            node_label = '{}: {}'.format(sub_tree['feat_idx'], sub_tree['feat_val'])
        else:
            node_label = '{:.2f}'.format(sub_tree)
        sub_node = Node._make([uuid.uuid4(), node_label])
        nodes.append(sub_node)

        edge = Edge._make([root_node, sub_node])
        edges.append(edge)

        sub_nodes, sub_edges = get_nodes_edges(sub_tree, root_node=sub_node)
        nodes.extend(sub_nodes)
        edges.extend(sub_edges)

    return nodes, edges

def dotify(tree):
    ''' Build the content of a Graphviz dot file for the tree.
    '''
    content = 'digraph decision_tree {\n'
    nodes, edges = get_nodes_edges(tree)

    for node in nodes:
        content += '    "{}" [label="{}"];\n'.format(node.id, node.label)

    for edge in edges:
        start, end = edge.start, edge.end
        content += '    "{}" -> "{}";\n'.format(start.id, end.id)
    content += '}'

    return content

def tree_predict(data, tree):
    ''' Predict the value of a data point with the given regression tree.
    '''
    if type(tree) is not dict:
        return tree

    feat_idx, feat_val = tree['feat_idx'], tree['feat_val']
    if data[feat_idx] < feat_val:
        sub_tree = tree['left']
    else:
        sub_tree = tree['right']

    return tree_predict(data, sub_tree)

if __name__ == '__main__':
    datafile = 'ex0.txt'
    dataset = load_data(datafile)
    tree = create_tree(dataset, fleaf, ferr, opt={'n_tolerance': 4,
                                                  'err_tolerance': 1})

    dotfile = '{}.dot'.format(datafile.split('.')[0])
    with open(dotfile, 'w') as f:
        content = dotify(tree)
        f.write(content)

    dataset = np.array(dataset)
    # Plot the scatter points
    plt.scatter(dataset[:, 0], dataset[:, 1])
    # Plot the regression curve
    x = np.linspace(0, 1, 50)
    y = [tree_predict([i], tree) for i in x]
    plt.plot(x, y, c='r')
    plt.show()
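The nested-dict trees that `create_tree` produces can be traversed without any of the library code above. This is a minimal, self-contained sketch; the `predict` helper and the hand-built `stump` below are illustrative only, mirroring the structure `create_tree` emits (internal nodes are dicts with `feat_idx`/`feat_val`, leaves are plain floats):

```python
# Minimal sketch: traversing a hand-built tree of the same nested-dict
# shape that create_tree produces. The stump and its values are made up.
def predict(data, tree):
    # Descend left when the feature value is below the split, else right,
    # until a leaf (a plain float) is reached.
    while isinstance(tree, dict):
        branch = 'left' if data[tree['feat_idx']] < tree['feat_val'] else 'right'
        tree = tree[branch]
    return tree

stump = {'feat_idx': 0, 'feat_val': 0.5,
         'left': 1.0,
         'right': {'feat_idx': 0, 'feat_val': 0.8,
                   'left': 2.0, 'right': 3.0}}

print(predict([0.3], stump))  # 1.0
print(predict([0.6], stump))  # 2.0
print(predict([0.9], stump))  # 3.0
```

This iterative walk makes the same left/right decisions as the recursive `tree_predict` above; only the control flow differs.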



================================================
FILE: decision_tree/english_big.txt
================================================
Urgent! call 09061749602 from Landline. Your complimentary 4* Tenerife Holiday or 10,000 cash await collection SAE T&Cs BOX 528 HP20 1YF 150ppm 18+,spam
+449071512431 URGENT! This is the 2nd attempt to contact U!U have WON 1250 CALL 09071512433 b4 050703 T&CsBCM4235WC1N3XX. callcost 150ppm mobilesvary. max7. 50,spam
FREE for 1st week! No1 Nokia tone 4 ur mob every week just txt NOKIA to 8007 Get txting and tell ur mates www.getzed.co.uk POBox 36504 W45WQ norm150p/tone 16+,spam
Urgent! call 09066612661 from landline. Your complementary 4* Tenerife Holiday or 10,000 cash await collection SAE T&Cs PO Box 3 WA14 2PX 150ppm 18+ Sender: Hol Offer,spam
WINNER!! As a valued network customer you have been selected to receivea 900 prize reward! To claim call 09061701461. Claim code KL341. Valid 12 hours only.,spam
okmail: Dear Dave this is your final notice to collect your 4* Tenerife Holiday or #5000 CASH award! Call 09061743806 from landline. TCs SAE Box326 CW25WX 150ppm,spam
07732584351 - Rodger Burns - MSG = We tried to call you re your reply to our sms for a free nokia mobile + free camcorder. Please call now 08000930705 for delivery tomorrow,spam
"URGENT! This is the 2nd attempt to contact U!U have WON 1000CALL 09071512432 b4 300603t&csBCM4235WC1N3XX.callcost150ppmmobilesvary. max7. 50",spam
Congrats! Nokia 3650 video camera phone is your Call 09066382422 Calls cost 150ppm Ave call 3mins vary from mobiles 16+ Close 300603 post BCM4284 Ldn WC1N3XX,spam
Urgent! Please call 0906346330. Your ABTA complimentary 4* Spanish Holiday or 10,000 cash await collection SAE T&Cs BOX 47 PO19 2EZ 150ppm 18+,spam
Congrats 2 mobile 3G Videophones R yours. call 09063458130 now! videochat wid ur mates, play java games, Dload polypH music, noline rentl. bx420. ip4. 5we. 150p,spam
Dear 0776xxxxxxx U've been invited to XCHAT. This is our final attempt to contact u! Txt CHAT to 86688 150p/MsgrcvdHG/Suite342/2Lands/Row/W1J6HL LDN 18yrs,spam
Win the newest Harry Potter and the Order of the Phoenix (Book 5) reply HARRY, answer 5 questions - chance to be the first among readers!,spam
SMS AUCTION - A BRAND NEW Nokia 7250 is up 4 auction today! Auction is FREE 2 join & take part! Txt NOKIA to 86021 now!,spam
09066362231 URGENT! Your mobile No 07xxxxxxxxx won a 2,000 bonus caller prize on 02/06/03! this is the 2nd attempt to reach YOU! call 09066362231 ASAP!,spam
Dear U've been invited to XCHAT. This is our final attempt to contact u! Txt CHAT to 86688,spam
449050000301 You have won a 2,000 price! To claim, call 09050000301.,spam
YOU ARE CHOSEN TO RECEIVE A 350 AWARD! Pls call claim number 09066364311 to collect your award which you are selected to receive as a valued mobile customer.,spam
44 7732584351, Do you want a New Nokia 3510i colour phone DeliveredTomorrow? With 300 free minutes to any mobile + 100 free texts + Free Camcorder reply or call 08000930705.,spam
URGENT! Your mobile was awarded a 1,500 Bonus Caller Prize on 27/6/03. Our final attempt 2 contact U! Call 08714714011,spam
Congrats! 2 mobile 3G Videophones R yours. call 09063458130 now! videochat wid your mates, play java games, Dload polyPH music, noline rentl.,spam
Wan2 win a Meet+Greet with Westlife 4 U or a m8? They are currently on what tour? 1)Unbreakable, 2)Untamed, 3)Unkempt. Text 1,2 or 3 to 83049. Cost 50p +std text,spam
URGENT This is our 2nd attempt to contact U. Your 900 prize from YESTERDAY is still awaiting collection. To claim CALL NOW 09061702893,spam
Want explicit SEX in 30 secs? Ring 02073162414 now! Costs 20p/min,spam
Sorry I missed your call let's talk when you have the time. I'm on 07090201529,spam
Congratulations YOU'VE Won. You're a Winner in our August 1000 Prize Draw. Call 09066660100 NOW. Prize Code 2309.,spam
Fantasy Football is back on your TV. Go to Sky Gamestar on Sky Active and play 250k Dream Team. Scoring starts on Saturday, so register now!SKY OPT OUT to 88088,spam
87077: Kick off a new season with 2wks FREE goals & news to ur mobile! Txt ur club name to 87077 eg VILLA to 87077,spam
This is the 2nd attempt to contract U, you have won this weeks top prize of either 1000 cash or 200 prize. Just call 09066361921,spam
You have won ?1,000 cash or a ?2,000 prize! To claim, call09050000327,spam
Talk sexy!! Make new friends or fall in love in the worlds most discreet text dating service. Just text VIP to 83110 and see who you could meet.,spam
Todays Vodafone numbers ending with 4882 are selected to a receive a 350 award. If your number matches call 09064019014 to receive your 350 award.,spam
GENT! We are trying to contact you. Last weekends draw shows that you won a 1000 prize GUARANTEED. Call 09064012160. Claim Code K52. Valid 12hrs only. 150ppm ,spam
Camera - You are awarded a SiPix Digital Camera! call 09061221066 fromm landline. Delivery within 28 days.,spam
YOU VE WON! Your 4* Costa Del Sol Holiday or 5000 await collection. Call 09050090044 Now toClaim. SAE, TC s, POBox334, Stockport, SK38xh, Cost1.50/pm, Max10mins,spam
WELL DONE! Your 4* Costa Del Sol Holiday or 5000 await collection. Call 09050090044 Now toClaim. SAE, TCs, POBox334, Stockport, SK38xh, Cost1.50/pm, Max10mins,spam
Camera - You are awarded a SiPix Digital Camera! call 09061221066 fromm landline. Delivery within 28 days,spam
Congratulations ur awarded 500 of CD vouchers or 125gift guaranteed & Free entry 2 100 wkly draw txt MUSIC to 87066,spam
Loan for any purpose 500 - 75,000. Homeowners + Tenants welcome. Have you been previously refused? We can still help. Call Free 0800 1956669 or text back 'help',spam
This is the 2nd time we have tried 2 contact u. U have won the 750 Pound prize. 2 claim is easy, call 087187272008 NOW1! Only 10p per minute. BT-national-rate.,spam
Congrats! 1 year special cinema pass for 2 is yours. call 09061209465 now! C Suprman V, Matrix3, StarWars3, etc all 4 FREE! bx420-ip4-5we. 150pm. Dont miss out!,spam
Message Important information for O2 user. Today is your lucky day! 2 find out why log onto http://www.urawinner.com there is a fantastic surprise awaiting you,spam
Had your mobile 11 months or more? U R entitled to Update to the latest colour mobiles with camera for Free! Call The Mobile Update Co FREE on 08002986030,spam
Bloomberg -Message center +447797706009 Why wait? Apply for your future http://careers. bloomberg.com,spam
Sppok up ur mob with a Halloween collection of nokia logo&pic message plus a FREE eerie tone, txt CARD SPOOK to 8007,spam
25p 4 alfie Moon's Children in need song on ur mob. Tell ur m8s. Txt Tone charity to 8007 for Nokias or Poly charity for polys: zed 08701417012 profit 2 charity.,spam
URGENT!: Your Mobile No. was awarded a 2,000 Bonus Caller Prize on 02/09/03! This is our 2nd attempt to contact YOU! Call 0871-872-9755 BOX95QU,spam
Phony 350 award - Todays Voda numbers ending XXXX are selected to receive a 350 award. If you have a match please call 08712300220 quoting claim code 3100 standard rates app,spam
we tried to contact you re your response to our offer of a new nokia fone and camcorder hit reply or call 08000930705 for delivery,spam
Hello from Orange. For 1 month's free access to games, news and sport, plus 10 free texts and 20 photo messages, reply YES. Terms apply: www.orange.co.uk/ow,spam
Ur HMV Quiz cash-balance is currently 500 - to maximize ur cash-in now send HMV1 to 86688 only 150p/msg,spam
YOU HAVE WON! As a valued Vodafone customer our computer has picked YOU to win a 150 prize. To collect is easy. Just call 09061743386,spam
Congratulations ur awarded either a yrs supply of CDs from Virgin Records or a Mystery Gift GUARANTEED Call 09061104283 Ts&Cs www.smsco.net 1.50pm approx 3mins,spam
A 400 XMAS REWARD IS WAITING FOR YOU! Our computer has randomly picked you from our loyal mobile customers to receive a 400 reward. Just call 09066380611 ,spam
December only! Had your mobile 11mths+? You are entitled to update to the latest colour camera mobile for Free! Call The Mobile Update Co FREE on 08002986906,spam
74355 XMAS iscoming & ur awarded either 500 CD gift vouchers & free entry 2 r 100 weekly draw txt MUSIC to 87066 TnC,spam
SIX chances to win CASH! From 100 to 20,000 pounds txt> CSH11 and send to 87575. Cost 150p/day, 6days, 16+ TsandCs apply Reply HL 4 info,spam
Todays Voda numbers ending 7548 are selected to receive a $350 award. If you have a match please call 08712300220 quoting claim code 4041 standard rates app,spam
Congratulations! Thanks to a good friend U have WON the 2,000 Xmas prize. 2 claim is easy, just call 08718726978 NOW! Only 10p per minute. BT-national-rate,spam
You have WON a guaranteed 1000 cash or a 2000 prize. To claim yr prize call our customer service representative on 08714712379 between 10am-7pm Cost 10p,spam
You are a winner you have been specially selected to receive 1000 cash or a 2000 award. Speak to a live operator to claim call 087147123779am-7pm. Cost 10p,spam
INTERFLORA - It's not too late to order Interflora flowers for christmas call 0800 505060 to place your order before Midnight tomorrow.,spam
8007 FREE for 1st week! No1 Nokia tone 4 ur mob every week just txt NOKIA to 8007 Get txting and tell ur mates www.getzed.co.uk POBox 36504 W4 5WQ norm 150p/tone 16+,spam
Congratulations ur awarded either 500 of CD gift vouchers & Free entry 2 our 100 weekly draw txt MUSIC to 87066 TnCs www.Ldew.com 1 win150ppmx3age16,spam
"For the most sparkling shopping breaks from 45 per person; call 0121 2025050 or visit www.shortbreaks.org.uk",spam
Are you unique enough? Find out from 30th August. www.areyouunique.co.uk,spam
WINNER! As a valued network customer you hvae been selected to receive a 900 reward! To collect call 09061701444. Valid 24 hours only. ACL03530150PM,spam
Congratulations U can claim 2 VIP row A Tickets 2 C Blu in concert in November or Blu gift guaranteed Call 09061104276 to claim TS&Cs www.smsco.net cost3.75max ,spam
This is the 2nd time we have tried to contact u. U have won the 1450 prize to claim just call 09053750005 b4 310303. T&Cs/stop SMS 08718725756. 140ppm,spam
Urgent Ur 500 guaranteed award is still unclaimed! Call 09066368327 NOW closingdate04/09/02 claimcode M39M51 1.50pmmorefrommobile2Bremoved-MobyPOBox734LS27YF,spam
If you don't, your prize will go to another customer. T&C at www.t-c.biz 18+ 150p/min Polo Ltd Suite 373 London W1J 6HL Please call back if busy,spam
No 1 POLYPHONIC tone 4 ur mob every week! Just txt PT2 to 87575. 1st Tone FREE ! so get txtin now and tell ur friends. 150p/tone. 16 reply HL 4info,spam
I don't know u and u don't know me. Send CHAT to 86688 now and let's find each other! Only 150p/Msg rcvd. HG/Suite342/2Lands/Row/W1J6HL LDN. 18 years or over.,spam
Send a logo 2 ur lover - 2 names joined by a heart. Txt LOVE NAME1 NAME2 MOBNO eg LOVE ADAM EVE 07123456789 to 87077 Yahoo! POBox36504W45WQ TxtNO 4 no ads 150p,spam
HMV BONUS SPECIAL 500 pounds of genuine HMV vouchers to be won. Just answer 4 easy questions. Play Now! Send HMV to 86688 More info:www.100percent-real.com,spam
Please call our customer service representative on 0800 169 6031 between 10am-9pm as you have WON a guaranteed 1000 cash or 5000 prize!,spam
You are being contacted by our dating service by someone you know! To find out who it is, call from a land line 09050000878. PoBox45W2TG150P,spam
83039 62735=450 UK Break AccommodationVouchers terms & conditions apply. 2 claim you mustprovide your claim number which is 15541 ,spam
You have an important customer service announcement from PREMIER. Call FREEPHONE 0800 542 0578 now!,spam
You are awarded a SiPix Digital Camera! call 09061221061 from landline. Delivery within 28days. T Cs Box177. M221BP. 2yr warranty. 150ppm. 16 . p p3.99,spam
Please call our customer service representative on FREEPHONE 0808 145 4742 between 9am-11pm as you have WON a guaranteed 1000 cash or 5000 prize!,spam
You are a winner U have been specially selected 2 receive 1000 cash or a 4* holiday (flights inc) speak to a live operator 2 claim 0871277810810,spam
"Hey sorry I didntgive ya a a bellearlier hunny,just been in bedbut mite go 2 thepub l8tr if uwana mt up?loads a luv Jenxxx.",ham
"Are you comingdown later?",ham
"HEY HEY WERETHE MONKEESPEOPLE SAY WE MONKEYAROUND! HOWDY GORGEOUS, HOWU DOIN? FOUNDURSELF A JOBYET SAUSAGE?LOVE JEN XXX",ham
"CHA QUITEAMUZING THATSCOOL BABE,PROBPOP IN & CU SATTHEN HUNNY 4BREKKIE! LOVE JEN XXX. PSXTRA LRG PORTIONS 4 ME PLEASE ",ham
"HEY BABE! FAR 2 SPUN-OUT 2 SPK AT DA MO... DEAD 2 DA WRLD. BEEN SLEEPING ON DA SOFA ALL DAY, HAD A COOL NYTHO, TX 4 FONIN HON, CALL 2MWEN IM BK FRMCLOUD 9! J X",ham
"CHEERS U TEX MECAUSE U WEREBORED! YEAH OKDEN HUNNY R UIN WK SAT?SOUNDS LIKEYOUR HAVIN GR8FUN J! KEEP UPDAT COUNTINLOTS OF LOVEME XXXXX.",ham
"EY! CALM DOWNON THEACUSATIONS.. ITXT U COS IWANA KNOW WOTU R DOIN AT THEW/END... HAVENTCN U IN AGES..RING ME IF UR UP4 NETHING SAT.LOVE J XXX.",ham
"YEH I AM DEF UP4 SOMETHING SAT,JUST GOT PAYED2DAY & I HAVBEEN GIVEN A50 PAY RISE 4MY WORK & HAVEBEEN MADE PRESCHOOLCO-ORDINATOR 2I AM FEELINGOOD LUV",ham
"Hi its Kate it was lovely to see you tonight and ill phone you tomorrow. I got to sing and a guy gave me his card! xxx",ham
"Thinking of u ;) x",ham
Me too! Have a lovely night xxx,ham
Hey hun-onbus goin 2 meet him. He wants 2go out 4a meal but I donyt feel like it cuz have 2 get last bus home!But hes sweet latelyxxx,ham
Hi mate its RV did u hav a nice hol just a message 3 say hello coz havent sent u 1 in ages started driving so stay off roads!RVx,ham
IM FINE BABES AINT BEEN UP 2 MUCH THO! SAW SCARY MOVIE YEST ITS QUITE FUNNY! WANT 2MRW AFTERNOON? AT TOWN OR MALL OR SUMTHIN?xx,ham
I notice you like looking in the shit mirror youre turning into a right freak,ham
IM LATE TELLMISS IM ON MY WAY,ham
Been up to ne thing interesting. Did you have a good birthday? When are u wrking nxt? I started uni today.,ham
IM GONNAMISSU SO MUCH!!I WOULD SAY IL SEND U A POSTCARD BUTTHERES ABOUTAS MUCH CHANCE OF MEREMEMBERIN ASTHERE IS OFSI NOT BREAKIN HIS CONTRACT!! LUV Yaxx,ham
Thanx 4 the time weve spent 2geva, its bin mint! Ur my Baby and all I want is u!xxxx,ham
You stayin out of trouble stranger!!saw Dave the other day hes sorted now!still with me bloke when u gona get a girl MR!ur mum still Thinks we will get 2GETHA! ,ham
THANX 4 PUTTIN DA FONE DOWN ON ME!!,ham
I know dat feelin had it with Pete! Wuld get with em , nuther place nuther time mayb?,ham
U 2.,ham
Thanx u darlin!im cool thanx. A few bday drinks 2 nite. 2morrow off! Take care c u soon.xxx,ham
HIYA COMIN 2 BRISTOL 1 ST WEEK IN APRIL. LES GOT OFF + RUDI ON NEW YRS EVE BUT I WAS SNORING.THEY WERE DRUNK! U BAK AT COLLEGE YET? MY WORK SENDS INK 2 BATH.,ham
Sez, hows u & de arab boy? Hope u r all good give my love 2 evry1 love ya eshxxxxxxxxxxx,ham
THING R GOOD THANX GOT EXAMS IN MARCH IVE DONE NO REVISION? IS FRAN STILL WITH BOYF? IVE GOTTA INTERVIW 4 EXETER BIT WORRIED!x,ham
I love u 2 babe! R u sure everything is alrite. Is he being an idiot? Txt bak girlie,ham
I luv u soo much u dont understand how special u r 2 me ring u 2morrow luv u xxx,ham
NOT MUCH NO FIGHTS. IT WAS A GOOD NITE!!,ham


================================================
FILE: decision_tree/lenses.dot
================================================
digraph decision_tree {
    "99d3b650-7557-420c-be5f-037403909eef" [label="tearRate"];
    "ccf5c62e-14ca-4cef-9525-4b8f026622dc" [label="no lenses"];
    "6a72f3f9-51ce-4433-b052-34765c65a61e" [label="astigmatic"];
    "91ea78df-9cfd-4334-a592-1c8b3c193f0d" [label="age"];
    "b5d2e2b7-241b-4c46-a56b-61ba9a1e7678" [label="soft"];
    "62193a33-c49d-4bce-b820-1613685e09ce" [label="soft"];
    "01240d64-7b96-40fc-9a4b-185cc0fca9d6" [label="prescript"];
    "5571119a-43b5-414e-9bf5-c9c62a9dee8c" [label="soft"];
    "087246f9-495f-44ef-8ea0-5043b238c1c1" [label="no lenses"];
    "c0b04ca3-692d-4498-8292-165ed4997ce5" [label="prescript"];
    "8f6cfe1f-a0ea-46de-a456-f3f8b35bca8d" [label="age"];
    "4d2b5c7f-e85e-4d44-8da9-0d88de048430" [label="hard"];
    "cdc375a5-561f-48c9-a847-ccc73f1cc44c" [label="no lenses"];
    "4600fda0-b8a8-45cc-8174-d554de9b7e84" [label="no lenses"];
    "08a19fa5-952c-4ab3-a283-dbfe2e3e5870" [label="hard"];
    "99d3b650-7557-420c-be5f-037403909eef" -> "ccf5c62e-14ca-4cef-9525-4b8f026622dc" [label="reduced"];
    "99d3b650-7557-420c-be5f-037403909eef" -> "6a72f3f9-51ce-4433-b052-34765c65a61e" [label="normal"];
    "6a72f3f9-51ce-4433-b052-34765c65a61e" -> "91ea78df-9cfd-4334-a592-1c8b3c193f0d" [label="no"];
    "91ea78df-9cfd-4334-a592-1c8b3c193f0d" -> "b5d2e2b7-241b-4c46-a56b-61ba9a1e7678" [label="young"];
    "91ea78df-9cfd-4334-a592-1c8b3c193f0d" -> "62193a33-c49d-4bce-b820-1613685e09ce" [label="pre"];
    "91ea78df-9cfd-4334-a592-1c8b3c193f0d" -> "01240d64-7b96-40fc-9a4b-185cc0fca9d6" [label="presbyopic"];
    "01240d64-7b96-40fc-9a4b-185cc0fca9d6" -> "5571119a-43b5-414e-9bf5-c9c62a9dee8c" [label="hyper"];
    "01240d64-7b96-40fc-9a4b-185cc0fca9d6" -> "087246f9-495f-44ef-8ea0-5043b238c1c1" [label="myope"];
    "6a72f3f9-51ce-4433-b052-34765c65a61e" -> "c0b04ca3-692d-4498-8292-165ed4997ce5" [label="yes"];
    "c0b04ca3-692d-4498-8292-165ed4997ce5" -> "8f6cfe1f-a0ea-46de-a456-f3f8b35bca8d" [label="hyper"];
    "8f6cfe1f-a0ea-46de-a456-f3f8b35bca8d" -> "4d2b5c7f-e85e-4d44-8da9-0d88de048430" [label="young"];
    "8f6cfe1f-a0ea-46de-a456-f3f8b35bca8d" -> "cdc375a5-561f-48c9-a847-ccc73f1cc44c" [label="pre"];
    "8f6cfe1f-a0ea-46de-a456-f3f8b35bca8d" -> "4600fda0-b8a8-45cc-8174-d554de9b7e84" [label="presbyopic"];
    "c0b04ca3-692d-4498-8292-165ed4997ce5" -> "08a19fa5-952c-4ab3-a283-dbfe2e3e5870" [label="myope"];
}

================================================
FILE: decision_tree/lenses.py
================================================
#!/usr/bin/env python
# -*- coding: utf-8 -*-

from trees import DecisionTreeClassifier

lense_labels = ['age', 'prescript', 'astigmatic', 'tearRate']
X = []
Y = []

with open('lenses.txt', 'r') as f:
    for line in f:
        comps = line.strip().split('\t')
        X.append(comps[: -1])
        Y.append(comps[-1])

clf = DecisionTreeClassifier()
clf.create_tree(X, Y, lense_labels)



================================================
FILE: decision_tree/lenses.txt
================================================
young	myope	no	reduced	no lenses
young	myope	no	normal	soft
young	myope	yes	reduced	no lenses
young	myope	yes	normal	hard
young	hyper	no	reduced	no lenses
young	hyper	no	normal	soft
young	hyper	yes	reduced	no lenses
young	hyper	yes	normal	hard
pre	myope	no	reduced	no lenses
pre	myope	no	normal	soft
pre	myope	yes	reduced	no lenses
pre	myope	yes	normal	hard
pre	hyper	no	reduced	no lenses
pre	hyper	no	normal	soft
pre	hyper	yes	reduced	no lenses
pre	hyper	yes	normal	no lenses
presbyopic	myope	no	reduced	no lenses
presbyopic	myope	no	normal	no lenses
presbyopic	myope	yes	reduced	no lenses
presbyopic	myope	yes	normal	hard
presbyopic	hyper	no	reduced	no lenses
presbyopic	hyper	no	normal	soft
presbyopic	hyper	yes	reduced	no lenses
presbyopic	hyper	yes	normal	no lenses


================================================
FILE: decision_tree/sms_tree.dot
================================================
digraph decision_tree {
    "959b4c0c-1821-446d-94a1-c619c2decfcd" [label="call"];
    "18665160-b058-437f-9b2e-05df2eb55661" [label="to"];
    "2eb9860d-d241-45ca-85e6-cbd80fe2ebf7" [label="your"];
    "bcbcc17c-9e2a-4bd4-a039-6e51fde5f8fd" [label="areyouunique"];
    "ca091fc7-8a4e-4970-9ec3-485a4628ad29" [label="02073162414"];
    "aac20872-1aac-499d-b2b5-caf0ef56eff3" [label="ham"];
    "18aa8685-a6e8-4d76-bad5-ccea922bb14d" [label="spam"];
    "3f7f30b1-4dbb-4459-9f25-358ad3c6d50b" [label="spam"];
    "44d1f972-cd97-4636-b6e6-a389bf560656" [label="spam"];
    "7f3c8562-69b5-47a9-8ee4-898bd4b6b506" [label="i"];
    "a6f22325-8841-4a81-bc04-4e7485117aa1" [label="spam"];
    "c181fe42-fd3c-48db-968a-502f8dd462a4" [label="ldn"];
    "51b9477a-0326-4774-8622-24d1d869a283" [label="ham"];
    "16f6aecd-c675-4291-867c-6c64d27eb3fc" [label="spam"];
    "adb05303-813a-4fe0-bf98-c319eb70be48" [label="spam"];
    "959b4c0c-1821-446d-94a1-c619c2decfcd" -> "18665160-b058-437f-9b2e-05df2eb55661" [label="0"];
    "18665160-b058-437f-9b2e-05df2eb55661" -> "2eb9860d-d241-45ca-85e6-cbd80fe2ebf7" [label="0"];
    "2eb9860d-d241-45ca-85e6-cbd80fe2ebf7" -> "bcbcc17c-9e2a-4bd4-a039-6e51fde5f8fd" [label="0"];
    "bcbcc17c-9e2a-4bd4-a039-6e51fde5f8fd" -> "ca091fc7-8a4e-4970-9ec3-485a4628ad29" [label="0"];
    "ca091fc7-8a4e-4970-9ec3-485a4628ad29" -> "aac20872-1aac-499d-b2b5-caf0ef56eff3" [label="0"];
    "ca091fc7-8a4e-4970-9ec3-485a4628ad29" -> "18aa8685-a6e8-4d76-bad5-ccea922bb14d" [label="1"];
    "bcbcc17c-9e2a-4bd4-a039-6e51fde5f8fd" -> "3f7f30b1-4dbb-4459-9f25-358ad3c6d50b" [label="1"];
    "2eb9860d-d241-45ca-85e6-cbd80fe2ebf7" -> "44d1f972-cd97-4636-b6e6-a389bf560656" [label="1"];
    "18665160-b058-437f-9b2e-05df2eb55661" -> "7f3c8562-69b5-47a9-8ee4-898bd4b6b506" [label="1"];
    "7f3c8562-69b5-47a9-8ee4-898bd4b6b506" -> "a6f22325-8841-4a81-bc04-4e7485117aa1" [label="0"];
    "7f3c8562-69b5-47a9-8ee4-898bd4b6b506" -> "c181fe42-fd3c-48db-968a-502f8dd462a4" [label="1"];
    "c181fe42-fd3c-48db-968a-502f8dd462a4" -> "51b9477a-0326-4774-8622-24d1d869a283" [label="0"];
    "c181fe42-fd3c-48db-968a-502f8dd462a4" -> "16f6aecd-c675-4291-867c-6c64d27eb3fc" [label="1"];
    "959b4c0c-1821-446d-94a1-c619c2decfcd" -> "adb05303-813a-4fe0-bf98-c319eb70be48" [label="1"];
}

================================================
FILE: decision_tree/sms_tree.py
================================================
#!/usr/bin/env python
# -*- coding: utf-8 -*-

''' Train a decision tree classifier on SMS spam data and evaluate it
    with hold-out validation.
'''

import re
import random
import os

from trees import DecisionTreeClassifier

ENCODING = 'ISO-8859-1'
TRAIN_PERCENTAGE = 0.9
  
def get_doc_vector(words, vocabulary):
    ''' Convert the words of a document into a document vector
        according to the vocabulary.

    :param words: list of words in the document
    :type words: list of str

    :param vocabulary: the full vocabulary list
    :type vocabulary: list of str

    :return doc_vect: document vector used for classification
    :type doc_vect: list of int
    '''
    doc_vect = [0]*len(vocabulary)

    for word in words:
        if word in vocabulary:
            idx = vocabulary.index(word)
            doc_vect[idx] = 1

    return doc_vect

def parse_line(line):
    ''' Parse a single line of the dataset; return the word list and the message class.
    '''
    cls = line.split(',')[-1].strip()
    content = ','.join(line.split(',')[: -1])
    word_vect = [word.lower() for word in re.split(r'\W+', content) if word]
    return word_vect, cls

def parse_file(filename):
    ''' Parse the data in the given file.
    '''
    vocabulary, word_vects, classes = [], [], []
    with open(filename, 'r', encoding=ENCODING) as f:
        for line in f:
            if line:
                word_vect, cls = parse_line(line)
                vocabulary.extend(word_vect)
                word_vects.append(word_vect)
                classes.append(cls)
    vocabulary = list(set(vocabulary))

    return vocabulary, word_vects, classes

if '__main__' == __name__:
    clf = DecisionTreeClassifier()
    vocabulary, word_vects, classes = parse_file('english_big.txt')

    # Training data & test data
    ntest = int(len(classes)*(1-TRAIN_PERCENTAGE))

    test_word_vects = []
    test_classes = []
    for i in range(ntest):
        idx = random.randint(0, len(word_vects)-1)
        test_word_vects.append(word_vects.pop(idx))
        test_classes.append(classes.pop(idx))

    train_word_vects = word_vects
    train_classes = classes

    train_dataset = [get_doc_vector(words, vocabulary) for words in train_word_vects]

    # Build the decision tree
    if not os.path.exists('sms_tree.pkl'):
        clf.create_tree(train_dataset, train_classes, vocabulary)
        clf.dump_tree('sms_tree.pkl')
    else:
        clf.load_tree('sms_tree.pkl')

    # Evaluate the model
    error = 0
    for test_word_vect, test_cls in zip(test_word_vects, test_classes):
        test_data = get_doc_vector(test_word_vect, vocabulary)
        pred_cls = clf.classify(test_data, feat_names=vocabulary)
        if test_cls != pred_cls:
            print('Predict: {} -- Actual: {}'.format(pred_cls, test_cls))
            error += 1

    print('Error Rate: {}'.format(error/len(test_classes)))
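The parsing helpers above can be checked in isolation. This standalone sketch re-declares `parse_line` and `get_doc_vector` and feeds them a made-up message and a tiny hypothetical vocabulary:

```python
import re

def parse_line(line):
    # The class label is the last comma-separated field; the rest is the message.
    cls = line.split(',')[-1].strip()
    content = ','.join(line.split(',')[:-1])
    words = [w.lower() for w in re.split(r'\W+', content) if w]
    return words, cls

def get_doc_vector(words, vocabulary):
    # One-hot bag-of-words vector over the vocabulary.
    doc_vect = [0] * len(vocabulary)
    for word in words:
        if word in vocabulary:
            doc_vect[vocabulary.index(word)] = 1
    return doc_vect

# Made-up message and vocabulary, for illustration only.
words, cls = parse_line('Urgent! call 09061749602 now,spam')
print(cls)                                               # spam
print(get_doc_vector(words, ['call', 'urgent', 'win']))  # [1, 1, 0]
```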



================================================
FILE: decision_tree/sms_tree_2.dot
================================================
digraph decision_tree {
    "8fbb40df-9b8c-4525-a34a-0ec254360649" [label="call"];
    "9c2cf1a6-e34b-4f3c-9cc0-17a4e20f12f7" [label="to"];
    "ef9e9738-2596-4bdf-a42d-28d7f5471ca7" [label="your"];
    "626c0a8b-c1fe-42d9-ad4f-b79e03cf82f7" [label="from"];
    "56bdce7c-b23c-4d52-a802-1c28377ec7f5" [label="explicit"];
    "ebe24cea-1310-40fe-b164-25032c942aec" [label="ham"];
    "1a56632b-860b-4ace-b604-59b9c3b06405" [label="spam"];
    "d7636d96-6f9e-4883-a581-c8919088cbf2" [label="spam"];
    "1d1933b4-12e1-41ea-b6c1-46f8bacb851c" [label="spam"];
    "ac8ca11e-10f5-4a3f-8c1d-e31933d74a8d" [label="when"];
    "00cb082b-b9c3-4417-9d25-209f2b4957c8" [label="spam"];
    "b7cc5eda-0d6a-4893-ba1d-78641ed8a949" [label="ham"];
    "577ef6a5-eb97-4dc1-9fae-741253db33aa" [label="dead"];
    "ae9e2b6c-1bdb-4cdb-aaea-01f0d3e138c6" [label="spam"];
    "6c303284-fb0a-44e7-b92d-dcae4ffd828d" [label="ham"];
    "8fbb40df-9b8c-4525-a34a-0ec254360649" -> "9c2cf1a6-e34b-4f3c-9cc0-17a4e20f12f7" [label="0"];
    "9c2cf1a6-e34b-4f3c-9cc0-17a4e20f12f7" -> "ef9e9738-2596-4bdf-a42d-28d7f5471ca7" [label="0"];
    "ef9e9738-2596-4bdf-a42d-28d7f5471ca7" -> "626c0a8b-c1fe-42d9-ad4f-b79e03cf82f7" [label="0"];
    "626c0a8b-c1fe-42d9-ad4f-b79e03cf82f7" -> "56bdce7c-b23c-4d52-a802-1c28377ec7f5" [label="0"];
    "56bdce7c-b23c-4d52-a802-1c28377ec7f5" -> "ebe24cea-1310-40fe-b164-25032c942aec" [label="0"];
    "56bdce7c-b23c-4d52-a802-1c28377ec7f5" -> "1a56632b-860b-4ace-b604-59b9c3b06405" [label="1"];
    "626c0a8b-c1fe-42d9-ad4f-b79e03cf82f7" -> "d7636d96-6f9e-4883-a581-c8919088cbf2" [label="1"];
    "ef9e9738-2596-4bdf-a42d-28d7f5471ca7" -> "1d1933b4-12e1-41ea-b6c1-46f8bacb851c" [label="1"];
    "9c2cf1a6-e34b-4f3c-9cc0-17a4e20f12f7" -> "ac8ca11e-10f5-4a3f-8c1d-e31933d74a8d" [label="1"];
    "ac8ca11e-10f5-4a3f-8c1d-e31933d74a8d" -> "00cb082b-b9c3-4417-9d25-209f2b4957c8" [label="0"];
    "ac8ca11e-10f5-4a3f-8c1d-e31933d74a8d" -> "b7cc5eda-0d6a-4893-ba1d-78641ed8a949" [label="1"];
    "8fbb40df-9b8c-4525-a34a-0ec254360649" -> "577ef6a5-eb97-4dc1-9fae-741253db33aa" [label="1"];
    "577ef6a5-eb97-4dc1-9fae-741253db33aa" -> "ae9e2b6c-1bdb-4cdb-aaea-01f0d3e138c6" [label="0"];
    "577ef6a5-eb97-4dc1-9fae-741253db33aa" -> "6c303284-fb0a-44e7-b92d-dcae4ffd828d" [label="1"];
}

================================================
FILE: decision_tree/trees.py
================================================
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: PytLab <shaozhengjiang@gmail.com>
# Date: 2017-07-07

import uuid
import pickle
from collections import defaultdict, namedtuple
from math import log2


class DecisionTreeClassifier(object):
    ''' Decision tree classifier that splits the dataset using the ID3 algorithm.
    '''

    @staticmethod
    def split_dataset(dataset, classes, feat_idx):
        ''' Split the dataset according to a given feature and its values.

        :param dataset: the dataset to split, a list of data vectors
        :param classes: classes corresponding to the dataset, same length as the dataset
        :param feat_idx: index of the feature in the feature vector

        :return splited_dict: dict storing the split data, mapping
            feature value -> [sub dataset, sub class list]
        '''
        splited_dict = {}
        for data_vect, cls in zip(dataset, classes):
            feat_val = data_vect[feat_idx]
            sub_dataset, sub_classes = splited_dict.setdefault(feat_val, [[], []])
            sub_dataset.append(data_vect[: feat_idx] + data_vect[feat_idx+1: ])
            sub_classes.append(cls)

        return splited_dict

    def get_shanno_entropy(self, values):
        ''' Compute the Shannon entropy of the values in the given list.
        '''
        uniq_vals = set(values)
        val_nums = {key: values.count(key) for key in uniq_vals}
        probs = [v/len(values) for k, v in val_nums.items()]
        entropy = sum([-prob*log2(prob) for prob in probs])
        return entropy

    def choose_best_split_feature(self, dataset, classes):
        ''' Choose the best feature to split the data on, based on information gain.

        :param dataset: the dataset to split
        :param classes: classes corresponding to the dataset

        :return: index of the feature with the largest information gain
        '''
        base_entropy = self.get_shanno_entropy(classes)

        feat_num = len(dataset[0])
        entropy_gains = []
        for i in range(feat_num):
            splited_dict = self.split_dataset(dataset, classes, i)
            new_entropy = sum([
                len(sub_classes)/len(classes)*self.get_shanno_entropy(sub_classes)
                for _, (_, sub_classes) in splited_dict.items()
            ])
            entropy_gains.append(base_entropy - new_entropy)

        return entropy_gains.index(max(entropy_gains))

    @staticmethod
    def get_majority(classes):
        ''' Return the majority class among the given classes.
        '''
        cls_num = defaultdict(lambda: 0)
        for cls in classes:
            cls_num[cls] += 1

        return max(cls_num, key=cls_num.get)

    def create_tree(self, dataset, classes, feat_names):
        ''' Recursively create a decision tree from the current dataset.

        :param dataset: the dataset
        :param classes: classes corresponding to the data in the dataset
        :param feat_names: feature names corresponding to the data in the dataset

        :return tree: the decision tree as a nested dict
        '''
        # Stop splitting if only one class is left in the dataset.
        if len(set(classes)) == 1:
            return classes[0]

        # If all features have been used up, return the majority class.
        if len(feat_names) == 0:
            return self.get_majority(classes)

        # Split the data to create a new subtree.
        tree = {}
        best_feat_idx = self.choose_best_split_feature(dataset, classes)
        feature = feat_names[best_feat_idx]
        tree[feature] = {}

        # Build the sub feature names used to create subtrees recursively.
        sub_feat_names = feat_names[:]
        sub_feat_names.pop(best_feat_idx)

        splited_dict = self.split_dataset(dataset, classes, best_feat_idx)
        for feat_val, (sub_dataset, sub_classes) in splited_dict.items():
            tree[feature][feat_val] = self.create_tree(sub_dataset,
                                                       sub_classes,
                                                       sub_feat_names)
        self.tree = tree
        self.feat_names = feat_names

        return tree

    def get_nodes_edges(self, tree=None, root_node=None):
        ''' Return all nodes and edges in the tree.
        '''
        Node = namedtuple('Node', ['id', 'label'])
        Edge = namedtuple('Edge', ['start', 'end', 'label'])

        if tree is None:
            tree = self.tree

        if type(tree) is not dict:
            return [], []

        nodes, edges = [], []

        if root_node is None:
            label = list(tree.keys())[0]
            root_node = Node._make([uuid.uuid4(), label])
            nodes.append(root_node)

        for edge_label, sub_tree in tree[root_node.label].items():
            node_label = list(sub_tree.keys())[0] if type(sub_tree) is dict else sub_tree
            sub_node = Node._make([uuid.uuid4(), node_label])
            nodes.append(sub_node)

            edge = Edge._make([root_node, sub_node, edge_label])
            edges.append(edge)

            sub_nodes, sub_edges = self.get_nodes_edges(sub_tree, root_node=sub_node)
            nodes.extend(sub_nodes)
            edges.extend(sub_edges)

        return nodes, edges

    def dotify(self, tree=None):
        ''' Generate the content of a Graphviz DOT file for the tree.
        '''
        if tree is None:
            tree = self.tree

        content = 'digraph decision_tree {\n'
        nodes, edges = self.get_nodes_edges(tree)

        for node in nodes:
            content += '    "{}" [label="{}"];\n'.format(node.id, node.label)

        for edge in edges:
            start, label, end = edge.start, edge.label, edge.end
            content += '    "{}" -> "{}" [label="{}"];\n'.format(start.id, end.id, label)
        content += '}'

        return content

    def classify(self, data_vect, feat_names=None, tree=None):
        ''' Classify a data vector with the constructed decision tree.
        '''
        if tree is None:
            tree = self.tree

        if feat_names is None:
            feat_names = self.feat_names

        # Recursive base case.
        if type(tree) is not dict:
            return tree

        feature = list(tree.keys())[0]
        value = data_vect[feat_names.index(feature)]
        sub_tree = tree[feature][value]

        return self.classify(data_vect, feat_names, sub_tree)

    def dump_tree(self, filename, tree=None):
        ''' Pickle the decision tree to a file.
        '''
        if tree is None:
            tree = self.tree

        with open(filename, 'wb') as f:
            pickle.dump(tree, f)

    def load_tree(self, filename):
        ''' Load a pickled tree from a file.
        '''
        with open(filename, 'rb') as f:
            tree = pickle.load(f)
            self.tree = tree
        return tree
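The `classify` method above walks a nested-dict tree of the form `{feature: {value: subtree_or_label}}` until it reaches a leaf label. A minimal standalone sketch of that recursion, using a hypothetical toy tree and feature names (not part of the repository), shows the format the class expects:

```python
# Standalone sketch of the nested-dict tree format used by
# DecisionTreeClassifier: {feature_name: {feature_value: subtree_or_label}}.
# The toy tree and feature names below are hypothetical, for illustration only.

def classify(data_vect, feat_names, tree):
    ''' Walk the nested dict until a leaf (non-dict) label is reached. '''
    if not isinstance(tree, dict):   # recursive base case: leaf label
        return tree
    feature = next(iter(tree))       # feature tested at the root of this subtree
    value = data_vect[feat_names.index(feature)]
    # Note: an unseen feature value raises KeyError here, as in the class above.
    return classify(data_vect, feat_names, tree[feature][value])

feat_names = ['outlook', 'windy']
tree = {'outlook': {'sunny': {'windy': {'yes': 'stay in', 'no': 'go out'}},
                    'rainy': 'stay in'}}

print(classify(['sunny', 'no'], feat_names, tree))   # -> go out
print(classify(['rainy', 'yes'], feat_names, tree))  # -> stay in
```

The same nested-dict shape is what `get_nodes_edges` and `dotify` traverse when emitting Graphviz DOT nodes and edges.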



================================================
FILE: linear_regression/abalone.txt
================================================
1	0.455	0.365	0.095	0.514	0.2245	0.101	0.15	15
1	0.35	0.265	0.09	0.2255	0.0995	0.0485	0.07	7
-1	0.53	0.42	0.135	0.677	0.2565	0.1415	0.21	9
1	0.44	0.365	0.125	0.516	0.2155	0.114	0.155	10
0	0.33	0.255	0.08	0.205	0.0895	0.0395	0.055	7
0	0.425	0.3	0.095	0.3515	0.141	0.0775	0.12	8
-1	0.53	0.415	0.15	0.7775	0.237	0.1415	0.33	20
-1	0.545	0.425	0.125	0.768	0.294	0.1495	0.26	16
1	0.475	0.37	0.125	0.5095	0.2165	0.1125	0.165	9
-1	0.55	0.44	0.15	0.8945	0.3145	0.151	0.32	19
-1	0.525	0.38	0.14	0.6065	0.194	0.1475	0.21	14
1	0.43	0.35	0.11	0.406	0.1675	0.081	0.135	10
1	0.49	0.38	0.135	0.5415	0.2175	0.095	0.19	11
-1	0.535	0.405	0.145	0.6845	0.2725	0.171	0.205	10
-1	0.47	0.355	0.1	0.4755	0.1675	0.0805	0.185	10
1	0.5	0.4	0.13	0.6645	0.258	0.133	0.24	12
0	0.355	0.28	0.085	0.2905	0.095	0.0395	0.115	7
-1	0.44	0.34	0.1	0.451	0.188	0.087	0.13	10
1	0.365	0.295	0.08	0.2555	0.097	0.043	0.1	7
1	0.45	0.32	0.1	0.381	0.1705	0.075	0.115	9
1	0.355	0.28	0.095	0.2455	0.0955	0.062	0.075	11
0	0.38	0.275	0.1	0.2255	0.08	0.049	0.085	10
-1	0.565	0.44	0.155	0.9395	0.4275	0.214	0.27	12
-1	0.55	0.415	0.135	0.7635	0.318	0.21	0.2	9
-1	0.615	0.48	0.165	1.1615	0.513	0.301	0.305	10
-1	0.56	0.44	0.14	0.9285	0.3825	0.188	0.3	11
-1	0.58	0.45	0.185	0.9955	0.3945	0.272	0.285	11
1	0.59	0.445	0.14	0.931	0.356	0.234	0.28	12
1	0.605	0.475	0.18	0.9365	0.394	0.219	0.295	15
1	0.575	0.425	0.14	0.8635	0.393	0.227	0.2	11
1	0.58	0.47	0.165	0.9975	0.3935	0.242	0.33	10
-1	0.68	0.56	0.165	1.639	0.6055	0.2805	0.46	15
1	0.665	0.525	0.165	1.338	0.5515	0.3575	0.35	18
-1	0.68	0.55	0.175	1.798	0.815	0.3925	0.455	19
-1	0.705	0.55	0.2	1.7095	0.633	0.4115	0.49	13
1	0.465	0.355	0.105	0.4795	0.227	0.124	0.125	8
-1	0.54	0.475	0.155	1.217	0.5305	0.3075	0.34	16
-1	0.45	0.355	0.105	0.5225	0.237	0.1165	0.145	8
-1	0.575	0.445	0.135	0.883	0.381	0.2035	0.26	11
1	0.355	0.29	0.09	0.3275	0.134	0.086	0.09	9
-1	0.45	0.335	0.105	0.425	0.1865	0.091	0.115	9
-1	0.55	0.425	0.135	0.8515	0.362	0.196	0.27	14
0	0.24	0.175	0.045	0.07	0.0315	0.0235	0.02	5
0	0.205	0.15	0.055	0.042	0.0255	0.015	0.012	5
0	0.21	0.15	0.05	0.042	0.0175	0.0125	0.015	4
0	0.39	0.295	0.095	0.203	0.0875	0.045	0.075	7
1	0.47	0.37	0.12	0.5795	0.293	0.227	0.14	9
-1	0.46	0.375	0.12	0.4605	0.1775	0.11	0.15	7
0	0.325	0.245	0.07	0.161	0.0755	0.0255	0.045	6
-1	0.525	0.425	0.16	0.8355	0.3545	0.2135	0.245	9
0	0.52	0.41	0.12	0.595	0.2385	0.111	0.19	8
1	0.4	0.32	0.095	0.303	0.1335	0.06	0.1	7
1	0.485	0.36	0.13	0.5415	0.2595	0.096	0.16	10
-1	0.47	0.36	0.12	0.4775	0.2105	0.1055	0.15	10
1	0.405	0.31	0.1	0.385	0.173	0.0915	0.11	7
-1	0.5	0.4	0.14	0.6615	0.2565	0.1755	0.22	8
1	0.445	0.35	0.12	0.4425	0.192	0.0955	0.135	8
1	0.47	0.385	0.135	0.5895	0.2765	0.12	0.17	8
0	0.245	0.19	0.06	0.086	0.042	0.014	0.025	4
-1	0.505	0.4	0.125	0.583	0.246	0.13	0.175	7
1	0.45	0.345	0.105	0.4115	0.18	0.1125	0.135	7
1	0.505	0.405	0.11	0.625	0.305	0.16	0.175	9
-1	0.53	0.41	0.13	0.6965	0.302	0.1935	0.2	10
1	0.425	0.325	0.095	0.3785	0.1705	0.08	0.1	7
1	0.52	0.4	0.12	0.58	0.234	0.1315	0.185	8
1	0.475	0.355	0.12	0.48	0.234	0.1015	0.135	8
-1	0.565	0.44	0.16	0.915	0.354	0.1935	0.32	12
-1	0.595	0.495	0.185	1.285	0.416	0.224	0.485	13
-1	0.475	0.39	0.12	0.5305	0.2135	0.1155	0.17	10
0	0.31	0.235	0.07	0.151	0.063	0.0405	0.045	6
1	0.555	0.425	0.13	0.7665	0.264	0.168	0.275	13
-1	0.4	0.32	0.11	0.353	0.1405	0.0985	0.1	8
-1	0.595	0.475	0.17	1.247	0.48	0.225	0.425	20
1	0.57	0.48	0.175	1.185	0.474	0.261	0.38	11
-1	0.605	0.45	0.195	1.098	0.481	0.2895	0.315	13
-1	0.6	0.475	0.15	1.0075	0.4425	0.221	0.28	15
1	0.595	0.475	0.14	0.944	0.3625	0.189	0.315	9
-1	0.6	0.47	0.15	0.922	0.363	0.194	0.305	10
-1	0.555	0.425	0.14	0.788	0.282	0.1595	0.285	11
-1	0.615	0.475	0.17	1.1025	0.4695	0.2355	0.345	14
-1	0.575	0.445	0.14	0.941	0.3845	0.252	0.285	9
1	0.62	0.51	0.175	1.615	0.5105	0.192	0.675	12
-1	0.52	0.425	0.165	0.9885	0.396	0.225	0.32	16
1	0.595	0.475	0.16	1.3175	0.408	0.234	0.58	21
1	0.58	0.45	0.14	1.013	0.38	0.216	0.36	14
-1	0.57	0.465	0.18	1.295	0.339	0.2225	0.44	12
1	0.625	0.465	0.14	1.195	0.4825	0.205	0.4	13
1	0.56	0.44	0.16	0.8645	0.3305	0.2075	0.26	10
-1	0.46	0.355	0.13	0.517	0.2205	0.114	0.165	9
-1	0.575	0.45	0.16	0.9775	0.3135	0.231	0.33	12
1	0.565	0.425	0.135	0.8115	0.341	0.1675	0.255	15
1	0.555	0.44	0.15	0.755	0.307	0.1525	0.26	12
1	0.595	0.465	0.175	1.115	0.4015	0.254	0.39	13
-1	0.625	0.495	0.165	1.262	0.507	0.318	0.39	10
1	0.695	0.56	0.19	1.494	0.588	0.3425	0.485	15
1	0.665	0.535	0.195	1.606	0.5755	0.388	0.48	14
1	0.535	0.435	0.15	0.725	0.269	0.1385	0.25	9
1	0.47	0.375	0.13	0.523	0.214	0.132	0.145	8
1	0.47	0.37	0.13	0.5225	0.201	0.133	0.165	7
-1	0.475	0.375	0.125	0.5785	0.2775	0.085	0.155	10
0	0.36	0.265	0.095	0.2315	0.105	0.046	0.075	7
1	0.55	0.435	0.145	0.843	0.328	0.1915	0.255	15
1	0.53	0.435	0.16	0.883	0.316	0.164	0.335	15
1	0.53	0.415	0.14	0.724	0.3105	0.1675	0.205	10
1	0.605	0.47	0.16	1.1735	0.4975	0.2405	0.345	12
-1	0.52	0.41	0.155	0.727	0.291	0.1835	0.235	12
-1	0.545	0.43	0.165	0.802	0.2935	0.183	0.28	11
-1	0.5	0.4	0.125	0.6675	0.261	0.1315	0.22	10
-1	0.51	0.39	0.135	0.6335	0.231	0.179	0.2	9
-1	0.435	0.395	0.105	0.3635	0.136	0.098	0.13	9
1	0.495	0.395	0.125	0.5415	0.2375	0.1345	0.155	9
1	0.465	0.36	0.105	0.431	0.172	0.107	0.175	9
0	0.435	0.32	0.08	0.3325	0.1485	0.0635	0.105	9
1	0.425	0.35	0.105	0.393	0.13	0.063	0.165	9
-1	0.545	0.41	0.125	0.6935	0.2975	0.146	0.21	11
-1	0.53	0.415	0.115	0.5915	0.233	0.1585	0.18	11
-1	0.49	0.375	0.135	0.6125	0.2555	0.102	0.22	11
1	0.44	0.34	0.105	0.402	0.1305	0.0955	0.165	10
-1	0.56	0.43	0.15	0.8825	0.3465	0.172	0.31	9
1	0.405	0.305	0.085	0.2605	0.1145	0.0595	0.085	8
-1	0.47	0.365	0.105	0.4205	0.163	0.1035	0.14	9
0	0.385	0.295	0.085	0.2535	0.103	0.0575	0.085	7
-1	0.515	0.425	0.14	0.766	0.304	0.1725	0.255	14
1	0.37	0.265	0.075	0.214	0.09	0.051	0.07	6
0	0.36	0.28	0.08	0.1755	0.081	0.0505	0.07	6
0	0.27	0.195	0.06	0.073	0.0285	0.0235	0.03	5
0	0.375	0.275	0.09	0.238	0.1075	0.0545	0.07	6
0	0.385	0.29	0.085	0.2505	0.112	0.061	0.08	8
1	0.7	0.535	0.16	1.7255	0.63	0.2635	0.54	19
1	0.71	0.54	0.165	1.959	0.7665	0.261	0.78	18
1	0.595	0.48	0.165	1.262	0.4835	0.283	0.41	17
-1	0.44	0.35	0.125	0.4035	0.175	0.063	0.129	9
-1	0.325	0.26	0.09	0.1915	0.085	0.036	0.062	7
0	0.35	0.26	0.095	0.211	0.086	0.056	0.068	7
0	0.265	0.2	0.065	0.0975	0.04	0.0205	0.028	7
-1	0.425	0.33	0.115	0.406	0.1635	0.081	0.1355	8
-1	0.305	0.23	0.08	0.156	0.0675	0.0345	0.048	7
1	0.345	0.255	0.09	0.2005	0.094	0.0295	0.063	9
-1	0.405	0.325	0.11	0.3555	0.151	0.063	0.117	9
1	0.375	0.285	0.095	0.253	0.096	0.0575	0.0925	9
-1	0.565	0.445	0.155	0.826	0.341	0.2055	0.2475	10
-1	0.55	0.45	0.145	0.741	0.295	0.1435	0.2665	10
1	0.65	0.52	0.19	1.3445	0.519	0.306	0.4465	16
1	0.56	0.455	0.155	0.797	0.34	0.19	0.2425	11
1	0.475	0.375	0.13	0.5175	0.2075	0.1165	0.17	10
-1	0.49	0.38	0.125	0.549	0.245	0.1075	0.174	10
1	0.46	0.35	0.12	0.515	0.224	0.108	0.1565	10
0	0.28	0.205	0.08	0.127	0.052	0.039	0.042	9
0	0.175	0.13	0.055	0.0315	0.0105	0.0065	0.0125	5
0	0.17	0.13	0.095	0.03	0.013	0.008	0.01	4
1	0.59	0.475	0.145	1.053	0.4415	0.262	0.325	15
-1	0.605	0.5	0.185	1.1185	0.469	0.2585	0.335	9
-1	0.635	0.515	0.19	1.3715	0.5065	0.305	0.45	10
-1	0.605	0.485	0.16	1.0565	0.37	0.2355	0.355	10
-1	0.565	0.45	0.135	0.9885	0.387	0.1495	0.31	12
1	0.515	0.405	0.13	0.722	0.32	0.131	0.21	10
-1	0.575	0.46	0.19	0.994	0.392	0.2425	0.34	13
1	0.645	0.485	0.215	1.514	0.546	0.2615	0.
SYMBOL INDEX (82 symbols across 19 files)

FILE: classification_and_regression_trees/compare.py
  function get_corrcoef (line 7) | def get_corrcoef(X, Y):

FILE: classification_and_regression_trees/model_tree.py
  function linear_regression (line 6) | def linear_regression(dataset):
  function fleaf (line 21) | def fleaf(dataset):
  function ferr (line 27) | def ferr(dataset):
  function get_nodes_edges (line 34) | def get_nodes_edges(tree, root_node=None):
  function dotify (line 67) | def dotify(tree):
  function tree_predict (line 83) | def tree_predict(data, tree):

FILE: classification_and_regression_trees/prune.py
  function not_tree (line 6) | def not_tree(tree):
  function collapse (line 11) | def collapse(tree):
  function postprune (line 19) | def postprune(tree, test_data):

FILE: classification_and_regression_trees/regression_tree.py
  function load_data (line 14) | def load_data(filename):
  function split_dataset (line 24) | def split_dataset(dataset, feat_idx, value):
  function create_tree (line 35) | def create_tree(dataset, fleaf, ferr, opt=None):
  function fleaf (line 67) | def fleaf(dataset):
  function ferr (line 73) | def ferr(dataset):
  function choose_best_feature (line 80) | def choose_best_feature(dataset, fleaf, ferr, opt):
  function get_nodes_edges (line 126) | def get_nodes_edges(tree, root_node=None):
  function dotify (line 159) | def dotify(tree):
  function tree_predict (line 175) | def tree_predict(data, tree):

FILE: decision_tree/sms_tree.py
  function get_doc_vector (line 19) | def get_doc_vector(words, vocabulary):
  function parse_line (line 40) | def parse_line(line):
  function parse_file (line 48) | def parse_file(filename):

FILE: decision_tree/trees.py
  class DecisionTreeClassifier (line 13) | class DecisionTreeClassifier(object):
    method split_dataset (line 18) | def split_dataset(dataset, classes, feat_idx):
    method get_shanno_entropy (line 36) | def get_shanno_entropy(self, values):
    method choose_best_split_feature (line 45) | def choose_best_split_feature(self, dataset, classes):
    method get_majority (line 67) | def get_majority(classes):
    method create_tree (line 76) | def create_tree(self, dataset, classes, feat_names):
    method get_nodes_edges (line 113) | def get_nodes_edges(self, tree=None, root_node=None):
    method dotify (line 146) | def dotify(self, tree=None):
    method classify (line 165) | def classify(self, data_vect, feat_names=None, tree=None):
    method dump_tree (line 184) | def dump_tree(self, filename, tree=None):
    method load_tree (line 193) | def load_tree(self, filename):

FILE: linear_regression/lasso_regression.py
  function lasso_regression (line 13) | def lasso_regression(X, y, lambd=0.2, threshold=0.1):
  function lasso_traj (line 54) | def lasso_traj(X, y, ntest=30):

FILE: linear_regression/local_weighted_linear_regression.py
  function lwlr (line 11) | def lwlr(x, X, Y, k):

FILE: linear_regression/ridge_regression.py
  function ridge_regression (line 11) | def ridge_regression(X, y, lambd=0.2):
  function ridge_traj (line 20) | def ridge_traj(X, y, ntest=30):

FILE: linear_regression/stage_wise_regression.py
  function stagewise_regression (line 10) | def stagewise_regression(X, y, eps=0.01, niter=100):

FILE: linear_regression/standard_linear_regression.py
  function load_data (line 8) | def load_data(filename):
  function standarize (line 21) | def standarize(X):
  function std_linreg (line 28) | def std_linreg(X, Y):
  function get_corrcoef (line 35) | def get_corrcoef(X, Y):

FILE: logistic_regression/logreg_grad_ascent.py
  class LogisticRegressionClassifier (line 10) | class LogisticRegressionClassifier(object):
    method sigmoid (line 15) | def sigmoid(x):
    method gradient_ascent (line 20) | def gradient_ascent(self, dataset, labels, max_iter=10000):
    method classify (line 44) | def classify(self, data, w=None):
  function load_data (line 54) | def load_data(filename):
  function snapshot (line 67) | def snapshot(w, dataset, labels, pic_name):

FILE: logistic_regression/logreg_stoch_grad_ascent.py
  class LogisticRegressionClassifier (line 12) | class LogisticRegressionClassifier(BaseClassifer):
    method stoch_gradient_ascent (line 14) | def stoch_gradient_ascent(self, dataset, labels, max_iter=150):

FILE: logistic_regression/sms.py
  function get_doc_vector (line 18) | def get_doc_vector(words, vocabulary):
  function parse_line (line 39) | def parse_line(line):
  function parse_file (line 47) | def parse_file(filename):

FILE: naive_bayes/bayes.py
  class NaiveBayesClassifier (line 8) | class NaiveBayesClassifier(object):
    method train (line 12) | def train(self, dataset, classes):
    method classify (line 49) | def classify(self, doc_vect, cond_probs, cls_probs):

FILE: naive_bayes/sms.py
  function get_doc_vector (line 18) | def get_doc_vector(words, vocabulary):
  function parse_line (line 39) | def parse_line(line):
  function parse_file (line 47) | def parse_file(filename):

FILE: support_vector_machine/svm_ga.py
  function load_data (line 22) | def load_data(filename):
  function get_w (line 31) | def get_w(alphas, dataset, labels):
  function fitness (line 59) | def fitness(indv):

FILE: support_vector_machine/svm_platt_smo.py
  class SVMUtil (line 10) | class SVMUtil(object):
    method __init__ (line 14) | def __init__(self, dataset, labels, C, tolerance=0.001):
    method f (line 24) | def f(self, x):
    method get_error (line 38) | def get_error(self, i):
    method update_errors (line 45) | def update_errors(self):
    method meet_kkt (line 50) | def meet_kkt(self, i):
  function load_data (line 61) | def load_data(filename):
  function clip (line 70) | def clip(alpha, L, H):
  function select_j_rand (line 80) | def select_j_rand(i, m):
  function select_j (line 87) | def select_j(i, svm_util):
  function get_w (line 107) | def get_w(alphas, dataset, labels):
  function take_step (line 116) | def take_step(i, j, svm_util):
  function examine_example (line 169) | def examine_example(i, svm_util):
  function platt_smo (line 184) | def platt_smo(dataset, labels, C, max_iter):

FILE: support_vector_machine/svm_simple_smo.py
  function load_data (line 10) | def load_data(filename):
  function clip (line 19) | def clip(alpha, L, H):
  function select_j (line 29) | def select_j(i, m):
  function get_w (line 36) | def get_w(alphas, dataset, labels):
  function simple_smo (line 45) | def simple_smo(dataset, labels, C, max_iter):
Condensed preview — 65 files, each showing path, character count, and a content snippet (922K chars of full structured content).
[
  {
    "path": ".gitignore",
    "chars": 1175,
    "preview": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packagi"
  },
  {
    "path": "README.md",
    "chars": 2288,
    "preview": "# MLBox\nMachine Learning Algorithms implementations\n\n# Blogs\n- [机器学习算法实践-决策树(Decision Tree)](http://pytlab.github.io/201"
  },
  {
    "path": "Reinforcement Learning/Calculating State Utilities.ipynb",
    "chars": 3685,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"A MDP is a reinterpretation of Mark"
  },
  {
    "path": "Reinforcement Learning/Calculating Transition Probabilities.ipynb",
    "chars": 2535,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \" 1. Set of possible states : S  = {"
  },
  {
    "path": "Reinforcement Learning/Defining Initial Distribution.ipynb",
    "chars": 2926,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"Let us now define the initial distr"
  },
  {
    "path": "Reinforcement Learning/Policy Iteration Algorithm.ipynb",
    "chars": 8857,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"Policy iteration is guaranteed to c"
  },
  {
    "path": "Reinforcement Learning/Value Iteration Algorithm.ipynb",
    "chars": 4695,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"Value Iteration algorithm uses the "
  },
  {
    "path": "classification_and_regression_trees/bikeSpeedVsIq_test.txt",
    "chars": 4235,
    "preview": "12.000000\t121.010516\r\n19.000000\t157.337044\r\n12.000000\t116.031825\r\n15.000000\t132.124872\r\n2.000000\t52.719612\r\n6.000000\t39."
  },
  {
    "path": "classification_and_regression_trees/bikeSpeedVsIq_train.txt",
    "chars": 4220,
    "preview": "3.000000\t46.852122\r\n23.000000\t178.676107\r\n0.000000\t86.154024\r\n6.000000\t68.707614\r\n15.000000\t139.737693\r\n17.000000\t141.98"
  },
  {
    "path": "classification_and_regression_trees/compare.py",
    "chars": 1556,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nfrom regression_tree import *\nfrom model_tree import linear_regression\n\nd"
  },
  {
    "path": "classification_and_regression_trees/dot/ex0.dot",
    "chars": 1273,
    "preview": "digraph decision_tree {\n    \"5db27cbb-29af-4987-9cd2-9217c781000d\" [label=\"0: 0.400158\"];\n    \"a81daf61-ab07-4e65-8b8a-5"
  },
  {
    "path": "classification_and_regression_trees/dot/ex00.dot",
    "chars": 381,
    "preview": "digraph decision_tree {\n    \"ccd352d8-dbf6-4f59-ae0b-c983f39e5c87\" [label=\"0: 0.50794\"];\n    \"46052817-27f4-4748-8f02-43"
  },
  {
    "path": "classification_and_regression_trees/dot/ex2.dot",
    "chars": 12311,
    "preview": "digraph decision_tree {\n    \"bdbe6f68-a446-4539-8a80-860f22663afe\" [label=\"0: 0.508542\"];\n    \"acef94b2-b18f-4c9c-bb21-4"
  },
  {
    "path": "classification_and_regression_trees/dot/ex2_prune.dot",
    "chars": 9923,
    "preview": "digraph decision_tree {\n    \"c4bff19d-b75d-4b50-99e8-34f696a77644\" [label=\"0: 0.508542\"];\n    \"68b83894-3568-462c-a8c2-a"
  },
  {
    "path": "classification_and_regression_trees/dot/exp2.dot",
    "chars": 456,
    "preview": "digraph decision_tree {\n    \"5c49cf77-b404-459e-b4fd-513a927807dc\" [label=\"0: 0.304401\"];\n    \"83d1a5dd-ca47-4f50-845b-3"
  },
  {
    "path": "classification_and_regression_trees/ex0.txt",
    "chars": 3821,
    "preview": "0.409175\t1.883180\r\n0.182603\t0.063908\r\n0.663687\t3.042257\r\n0.517395\t2.305004\r\n0.013643\t-0.067698\r\n0.469643\t1.662809\r\n0.725"
  },
  {
    "path": "classification_and_regression_trees/ex00.txt",
    "chars": 3846,
    "preview": "0.036098\t0.155096\r\n0.993349\t1.077553\r\n0.530897\t0.893462\r\n0.712386\t0.564858\r\n0.343554\t-0.371700\r\n0.098016\t-0.332760\r\n0.69"
  },
  {
    "path": "classification_and_regression_trees/ex2.dot",
    "chars": 682,
    "preview": "digraph decision_tree {\n    \"e1b05249-eb8e-4afd-837c-d2f5a5299a6a\" [label=\"0: 0.508542\"];\n    \"b82d5e44-41de-40ec-8558-f"
  },
  {
    "path": "classification_and_regression_trees/ex2.txt",
    "chars": 4069,
    "preview": "0.228628\t-2.266273\r\n0.965969\t112.386764\r\n0.342761\t-31.584855\r\n0.901444\t87.300625\r\n0.585413\t125.295113\r\n0.334900\t18.97665"
  },
  {
    "path": "classification_and_regression_trees/ex2test.txt",
    "chars": 4064,
    "preview": "0.421862\t10.830241\r\n0.105349\t-2.241611\r\n0.155196\t21.872976\r\n0.161152\t2.015418\r\n0.382632\t-38.778979\r\n0.017710\t20.109113\r\n"
  },
  {
    "path": "classification_and_regression_trees/exp.txt",
    "chars": 3998,
    "preview": "0.529582\t100.737303\r\n0.985730\t103.106872\r\n0.797869\t99.666151\r\n0.393473\t-1.773056\r\n0.272568\t-1.170222\r\n0.758825\t96.752440"
  },
  {
    "path": "classification_and_regression_trees/exp2.dot",
    "chars": 456,
    "preview": "digraph decision_tree {\n    \"c830d5ff-5d25-4637-a268-2bb63f5d4351\" [label=\"0: 0.304401\"];\n    \"44889deb-3d44-405b-a7cf-d"
  },
  {
    "path": "classification_and_regression_trees/exp2.txt",
    "chars": 3831,
    "preview": "0.070670\t3.470829\r\n0.534076\t6.377132\r\n0.747221\t8.949407\r\n0.668970\t8.034081\r\n0.586082\t6.997721\r\n0.764962\t9.318110\r\n0.6581"
  },
  {
    "path": "classification_and_regression_trees/model_tree.py",
    "chars": 2904,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nfrom regression_tree import *\n\ndef linear_regression(dataset):\n    ''' 获取"
  },
  {
    "path": "classification_and_regression_trees/notebook/分段函数回归树.ipynb",
    "chars": 22802,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 10,\n   \"metadata\": {\n    \"scrolled\": true\n   },\n   \"outp"
  },
  {
    "path": "classification_and_regression_trees/notebook/后剪枝.ipynb",
    "chars": 14241,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n "
  },
  {
    "path": "classification_and_regression_trees/notebook/模型树对分段线性函数进行回归.ipynb",
    "chars": 23208,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n "
  },
  {
    "path": "classification_and_regression_trees/prune.py",
    "chars": 1237,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nfrom regression_tree import *\n\ndef not_tree(tree):\n    ''' 判断是否不是一棵树结构\n  "
  },
  {
    "path": "classification_and_regression_trees/regression_tree.py",
    "chars": 5388,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\n''' 回归树实现\n'''\n\nimport uuid\nfrom functools import namedtuple\n\nimport numpy"
  },
  {
    "path": "decision_tree/english_big.txt",
    "chars": 15139,
    "preview": "Urgent! call 09061749602 from Landline. Your complimentary 4* Tenerife Holiday or 10,000 cash await collection SAE T&Cs "
  },
  {
    "path": "decision_tree/lenses.dot",
    "chars": 2380,
    "preview": "digraph decision_tree {\n    \"99d3b650-7557-420c-be5f-037403909eef\" [label=\"tearRate\"];\n    \"ccf5c62e-14ca-4cef-9525-4b8f"
  },
  {
    "path": "decision_tree/lenses.py",
    "chars": 388,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nfrom trees import DecisionTreeClassifier\n\nlense_labels = ['age', 'prescri"
  },
  {
    "path": "decision_tree/lenses.txt",
    "chars": 795,
    "preview": "young\tmyope\tno\treduced\tno lenses\r\nyoung\tmyope\tno\tnormal\tsoft\r\nyoung\tmyope\tyes\treduced\tno lenses\r\nyoung\tmyope\tyes\tnormal\t"
  },
  {
    "path": "decision_tree/sms_tree.dot",
    "chars": 2289,
    "preview": "digraph decision_tree {\n    \"959b4c0c-1821-446d-94a1-c619c2decfcd\" [label=\"call\"];\n    \"18665160-b058-437f-9b2e-05df2eb5"
  },
  {
    "path": "decision_tree/sms_tree.py",
    "chars": 2619,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\n''' 通过垃圾短信数据训练朴素贝叶斯模型,并进行留存交叉验证\n'''\n\nimport re\nimport random\nimport os\n\ni"
  },
  {
    "path": "decision_tree/sms_tree_2.dot",
    "chars": 2281,
    "preview": "digraph decision_tree {\n    \"8fbb40df-9b8c-4525-a34a-0ec254360649\" [label=\"call\"];\n    \"9c2cf1a6-e34b-4f3c-9cc0-17a4e20f"
  },
  {
    "path": "decision_tree/trees.py",
    "chars": 5930,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n# Author: PytLab <shaozhengjiang@gmail.com>\n# Date: 2017-07-07\n\nimport cop"
  },
  {
    "path": "linear_regression/abalone.txt",
    "chars": 197357,
    "preview": "1\t0.455\t0.365\t0.095\t0.514\t0.2245\t0.101\t0.15\t15\r\n1\t0.35\t0.265\t0.09\t0.2255\t0.0995\t0.0485\t0.07\t7\r\n-1\t0.53\t0.42\t0.135\t0.677\t"
  },
  {
    "path": "linear_regression/ex0.txt",
    "chars": 5600,
    "preview": "1.000000\t0.067732\t3.176513\r\n1.000000\t0.427810\t3.816464\r\n1.000000\t0.995731\t4.550095\r\n1.000000\t0.738336\t4.256571\r\n1.000000"
  },
  {
    "path": "linear_regression/ex1.txt",
    "chars": 5600,
    "preview": "1.000000\t0.635975\t4.093119\r\n1.000000\t0.552438\t3.804358\r\n1.000000\t0.855922\t4.456531\r\n1.000000\t0.083386\t3.187049\r\n1.000000"
  },
  {
    "path": "linear_regression/lasso_regression.ipynb",
    "chars": 8874,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n "
  },
  {
    "path": "linear_regression/lasso_regression.py",
    "chars": 2085,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nimport itertools\nfrom math import exp\n\nimport numpy as np\nimport matplotl"
  },
  {
    "path": "linear_regression/lasso_traj.ipynb",
    "chars": 20630,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n "
  },
  {
    "path": "linear_regression/local_weighted_linear_regression.py",
    "chars": 1265,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nfrom math import exp\n\nimport numpy as np\nimport matplotlib.pyplot as plt\n"
  },
  {
    "path": "linear_regression/ridge_regression.ipynb",
    "chars": 19902,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n "
  },
  {
    "path": "linear_regression/ridge_regression.py",
    "chars": 1689,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nfrom math import exp\n\nimport numpy as np\nimport matplotlib.pyplot as plt\n"
  },
  {
    "path": "linear_regression/stage_wise_regression.py",
    "chars": 1486,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nimport numpy as np\nimport matplotlib.pyplot as plt\n\nfrom standard_linear_"
  },
  {
    "path": "linear_regression/stage_wise_traj.ipynb",
    "chars": 101517,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n "
  },
  {
    "path": "linear_regression/standard_linear_regression.py",
    "chars": 1625,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nimport numpy as np\nimport matplotlib.pyplot as plt\n\n\ndef load_data(filena"
  },
  {
    "path": "logistic_regression/english_big.txt",
    "chars": 114216,
    "preview": "Urgent! call 09061749602 from Landline. Your complimentary 4* Tenerife Holiday or 10,000 cash await collection SAE T&Cs "
  },
  {
    "path": "logistic_regression/logreg_grad_ascent.py",
    "chars": 2980,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nimport os\nfrom math import exp\n\nimport numpy as np\nimport matplotlib.pypl"
  },
  {
    "path": "logistic_regression/logreg_stoch_grad_ascent.py",
    "chars": 1629,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nimport random\n\nimport numpy as np\nimport matplotlib.pyplot as plt\n\nfrom l"
  },
  {
    "path": "logistic_regression/sms.py",
    "chars": 2641,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\n''' 通过垃圾短信数据训练Logistic回归模型,并进行留存交叉验证\n'''\n\nimport re\nimport random\n\nimport"
  },
  {
    "path": "logistic_regression/testSet.txt",
    "chars": 2187,
    "preview": "-0.017612\t14.053064\t0\r\n-1.395634\t4.662541\t1\r\n-0.752157\t6.538620\t0\r\n-1.322371\t7.152853\t0\r\n0.423363\t11.054677\t0\r\n0.406704\t"
  },
  {
    "path": "naive_bayes/bayes.py",
    "chars": 1627,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nfrom collections import defaultdict\n\nimport numpy as np\n\nclass NaiveBayes"
  },
  {
    "path": "naive_bayes/english_big.txt",
    "chars": 114216,
    "preview": "Urgent! call 09061749602 from Landline. Your complimentary 4* Tenerife Holiday or 10,000 cash await collection SAE T&Cs "
  },
  {
    "path": "naive_bayes/sms.py",
    "chars": 2776,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\n''' 通过垃圾短信数据训练朴素贝叶斯模型,并进行留存交叉验证\n'''\n\nimport re\nimport random\n\nimport nump"
  },
  {
    "path": "support_vector_machine/best_fit.py",
    "chars": 28091,
    "preview": "best_fit = [\n    (0, [0.9643380556912553, -0.14557889594528595, -5.0], 0.4416388939912057),\n    (1, [0.8451392281387395,"
  },
  {
    "path": "support_vector_machine/svm_ga.py",
    "chars": 2669,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\n''' 使用遗传算法框架GAFT优化SVM.\n\nGAFT项目地址: https://github.com/PytLab/gaft\n'''\n\nimp"
  },
  {
    "path": "support_vector_machine/svm_platt_smo.py",
    "chars": 7053,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nimport random\n\nimport numpy as np\nimport matplotlib.pyplot as plt\n\n\nclass"
  },
  {
    "path": "support_vector_machine/svm_simple_smo.py",
    "chars": 4699,
    "preview": "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nimport random\n\nimport numpy as np\nimport matplotlib.pyplot as plt\n\n\ndef l"
  },
  {
    "path": "support_vector_machine/testSet.txt",
    "chars": 2208,
    "preview": "3.542485\t1.977398\t-1\r\n3.018896\t2.556416\t-1\r\n7.551510\t-1.580030\t1\r\n2.114999\t-0.004466\t-1\r\n8.127113\t1.274372\t1\r\n7.108772\t-"
  }
]

// ... and 3 more files (content truncated in this preview)

About this extraction

This page contains the full source code of the PytLab/MLBox GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 65 files (821.8 KB), approximately 423.5k tokens, and a symbol index with 82 extracted functions, classes, methods, constants, and types.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
