Repository: PytLab/MLBox
Branch: master
Commit: e916cd8ff9c3
Files: 65
Total size: 821.8 KB
Directory structure:
gitextract_2fv0thbp/
├── .gitignore
├── README.md
├── Reinforcement Learning/
│ ├── Calculating State Utilities.ipynb
│ ├── Calculating Transition Probabilities.ipynb
│ ├── Defining Initial Distribution.ipynb
│ ├── Policy Iteration Algorithm.ipynb
│ ├── T.npy
│ └── Value Iteration Algorithm.ipynb
├── classification_and_regression_trees/
│ ├── bikeSpeedVsIq_test.txt
│ ├── bikeSpeedVsIq_train.txt
│ ├── compare.py
│ ├── dot/
│ │ ├── ex0.dot
│ │ ├── ex00.dot
│ │ ├── ex2.dot
│ │ ├── ex2_prune.dot
│ │ └── exp2.dot
│ ├── ex0.txt
│ ├── ex00.txt
│ ├── ex2.dot
│ ├── ex2.txt
│ ├── ex2test.txt
│ ├── exp.txt
│ ├── exp2.dot
│ ├── exp2.txt
│ ├── model_tree.py
│ ├── notebook/
│ │ ├── 分段函数回归树.ipynb
│ │ ├── 后剪枝.ipynb
│ │ └── 模型树对分段线性函数进行回归.ipynb
│ ├── prune.py
│ └── regression_tree.py
├── decision_tree/
│ ├── english_big.txt
│ ├── lenses.dot
│ ├── lenses.py
│ ├── lenses.txt
│ ├── sms_tree.dot
│ ├── sms_tree.pkl
│ ├── sms_tree.py
│ ├── sms_tree_2.dot
│ └── trees.py
├── linear_regression/
│ ├── abalone.txt
│ ├── ex0.txt
│ ├── ex1.txt
│ ├── lasso_regression.ipynb
│ ├── lasso_regression.py
│ ├── lasso_traj.ipynb
│ ├── lasso_ws
│ ├── local_weighted_linear_regression.py
│ ├── ridge_regression.ipynb
│ ├── ridge_regression.py
│ ├── stage_wise_regression.py
│ ├── stage_wise_traj.ipynb
│ └── standard_linear_regression.py
├── logistic_regression/
│ ├── english_big.txt
│ ├── logreg_grad_ascent.py
│ ├── logreg_stoch_grad_ascent.py
│ ├── sms.py
│ └── testSet.txt
├── naive_bayes/
│ ├── bayes.py
│ ├── english_big.txt
│ └── sms.py
└── support_vector_machine/
├── best_fit.py
├── svm_ga.py
├── svm_platt_smo.py
├── svm_simple_smo.py
└── testSet.txt
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
# Jupyter Notebook
.ipynb_checkpoints
# pyenv
.python-version
# celery beat schedule file
celerybeat-schedule
# SageMath parsed files
*.sage.py
# dotenv
.env
# virtualenv
.venv
venv/
ENV/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.DS_Store
*.swp
================================================
FILE: README.md
================================================
# MLBox
Implementations of machine learning algorithms.
# Blogs
- [Machine Learning Algorithms in Practice - Decision Tree](http://pytlab.github.io/2017/07/09/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-%E5%86%B3%E7%AD%96%E6%A0%91/)
- [Machine Learning Algorithms in Practice - Naive Bayes](http://pytlab.github.io/2017/07/11/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E5%AE%9E%E8%B7%B5-%E6%9C%B4%E7%B4%A0%E8%B4%9D%E5%8F%B6%E6%96%AF-Naive-Bayes/)
- [Machine Learning Algorithms in Practice - Logistic Regression and Gradient Ascent (Part 1)](http://pytlab.github.io/2017/07/13/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-Logistic%E5%9B%9E%E5%BD%92%E4%B8%8E%E6%A2%AF%E5%BA%A6%E4%B8%8A%E5%8D%87%E7%AE%97%E6%B3%95-%E4%B8%8A/)
- [Machine Learning Algorithms in Practice - Logistic Regression and Gradient Ascent (Part 2)](http://pytlab.github.io/2017/07/15/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-Logistic%E5%9B%9E%E5%BD%92%E4%B8%8E%E6%A2%AF%E5%BA%A6%E4%B8%8A%E5%8D%87%E7%AE%97%E6%B3%95-%E4%B8%8B/)
- [Machine Learning Algorithms in Practice - Support Vector Machine (SVM) Algorithm Principles](http://pytlab.github.io/2017/08/15/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-%E6%94%AF%E6%8C%81%E5%90%91%E9%87%8F%E6%9C%BA-SVM-%E7%AE%97%E6%B3%95%E5%8E%9F%E7%90%86/)
- [Machine Learning Algorithms in Practice - SVM Kernel Functions and Soft Margin](http://pytlab.github.io/2017/08/30/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-SVM%E6%A0%B8%E5%87%BD%E6%95%B0%E5%92%8C%E8%BD%AF%E9%97%B4%E9%9A%94/)
- [Machine Learning Algorithms in Practice - The SMO Algorithm in SVM](http://pytlab.github.io/2017/09/01/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-SVM%E4%B8%AD%E7%9A%84SMO%E7%AE%97%E6%B3%95/)
- [Machine Learning Algorithms in Practice - Optimizing SVM with Platt SMO and Genetic Algorithms](http://pytlab.github.io/2017/10/15/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-Platt-SMO%E5%92%8C%E9%81%97%E4%BC%A0%E7%AE%97%E6%B3%95%E4%BC%98%E5%8C%96SVM/)
- [Machine Learning Algorithms in Practice - Standard and Locally Weighted Linear Regression](http://pytlab.github.io/2017/10/24/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-%E6%A0%87%E5%87%86%E4%B8%8E%E5%B1%80%E9%83%A8%E5%8A%A0%E6%9D%83%E7%BA%BF%E6%80%A7%E5%9B%9E%E5%BD%92/)
- [Machine Learning Algorithms in Practice - Ridge Regression and LASSO](http://pytlab.github.io/2017/10/27/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E5%AE%9E%E8%B7%B5-%E5%B2%AD%E5%9B%9E%E5%BD%92%E5%92%8CLASSO%E5%9B%9E%E5%BD%92/)
- [Machine Learning Algorithms in Practice - Tree Regression](http://pytlab.github.io/2017/11/03/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E7%AE%97%E6%B3%95%E5%AE%9E%E8%B7%B5-%E6%A0%91%E5%9B%9E%E5%BD%92/)
================================================
FILE: Reinforcement Learning/Calculating State Utilities.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A MDP is a reinterpretation of Markov chains which includes an agent and a decision making process. A MDP is defined by these components:\n",
"1. Set of possible States: S={s0,s1,...,sm}\n",
"2. Initial State:s0\n",
"3. Set of possible Actions:A={a0,a1,...,an}\n",
"4. Transition Model:T(s,a,s′)\n",
"5. Reward Function: R(s)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We are going to implement MDP in a grid world of 3 x 4 space where our agent/robot is situated at (1,1) in the beginning and needs to reach (3,4) state which is its desired goal state. There is also a fault state at (2,4) which the robot needs to avoid at all costs. The movement of the robot from one state to another earns it a reward. Naturally, the reward for the goal state is the highest and the least for the fault state. The objective of the robot is to maximize its reward and thus plan its movements/actions accordingly. It can move in any direction and this is a stochastic process."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To compare the states, we calculate the utility of these states and this is shown below:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def state_utility(v, T, u, reward, gamma):\n",
" \n",
" #v is the state vector\n",
" #T is the transition matrix\n",
" #u is the utility vector\n",
" #reward consists of the rewards earned for moving to a particular state\n",
" #gamma is the discount factor by which rewards are discounted over the time\n",
"\n",
" action_array = np.zeros(4)\n",
" for action in range(0, 4):\n",
" action_array[action] = np.sum(np.multiply(u, np.dot(v, T[:,:,action])))\n",
" return reward + gamma * np.max(action_array)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Utility of state (1,1): 0.7056\n"
]
}
],
"source": [
"def main():\n",
" \n",
" #The agent starts from (1, 1)\n",
" v = np.array([[0.0, 0.0, 0.0, 0.0, \n",
" 0.0, 0.0, 0.0, 0.0, \n",
" 1.0, 0.0, 0.0, 0.0]])\n",
" \n",
" #file loaded from the folder\n",
" T = np.load(\"T.npy\")\n",
"\n",
" #Utility vector\n",
" u = np.array([[0.812, 0.868, 0.918, 1.0,\n",
" 0.762, 0.0, 0.660, -1.0,\n",
" 0.705, 0.655, 0.611, 0.388]])\n",
"\n",
" #Define the reward for state (1,1)\n",
" reward = -0.04\n",
" #Assume that the discount factor is equal to 1.0\n",
" gamma = 1.0\n",
"\n",
" #Use the Bellman equation to find the utility of state (1,1)\n",
" utility_11 = state_utility(v, T, u, reward, gamma)\n",
" print(\"Utility of state (1,1): \" + str(utility_11))\n",
"\n",
"if __name__ == \"__main__\":\n",
" main()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
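The notebook above computes a single Bellman backup for one state of the 3 x 4 grid world and relies on the transition tensor stored in T.npy. As a minimal, self-contained sketch of the same update, the snippet below runs the identical calculation on a hypothetical 2-state, 2-action MDP; every number in it is made up for illustration and none of it comes from T.npy.

import numpy as np

def state_utility(v, T, u, reward, gamma):
    # Expected utility of the best action taken from the state selected by v.
    action_array = np.zeros(T.shape[2])
    for action in range(T.shape[2]):
        action_array[action] = np.sum(np.multiply(u, np.dot(v, T[:, :, action])))
    return reward + gamma * np.max(action_array)

# Hypothetical transition tensor with T[s, s', a] = P(s' | s, a),
# matching how the notebook indexes T[:, :, action].
T = np.zeros((2, 2, 2))
T[:, :, 0] = [[0.9, 0.1],
              [0.5, 0.5]]
T[:, :, 1] = [[0.2, 0.8],
              [0.0, 1.0]]

u = np.array([[0.0, 1.0]])  # current utility estimates for s0 and s1
v = np.array([[1.0, 0.0]])  # one-hot selector for state s0

# Action 1 reaches s1 (utility 1.0) with probability 0.8,
# so the result is -0.04 + 1.0 * 0.8 = 0.76.
print(state_utility(v, T, u, reward=-0.04, gamma=1.0))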
================================================
FILE: Reinforcement Learning/Calculating Transition Probabilities.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
" 1. Set of possible states : S = {s0,s1,s2,......,sn} \n",
" 2. Initial State: s0 \n",
" 3. Transition Model: T(s,s')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let’s suppose we have a chain with only two states s0 and s1, where s0 is the initial state. The process is in s0 90% of the time and it can move to s1 the remaining 10% of the time. When the process is in state s1 it will remain there 50% of the time. Given this data we can create a Transition Matrix T as follows:\n",
"T=[[0.90 0.10]\n",
" [0.50 0.50]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Computing the k-step transition probability:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"T: [[0.9 0.1]\n",
" [0.5 0.5]]\n",
"T_5: [[0.83504 0.16496]\n",
" [0.8248 0.1752 ]]\n",
"T_25: [[0.83333333 0.16666667]\n",
" [0.83333333 0.16666667]]\n",
"T_50: [[0.83333333 0.16666667]\n",
" [0.83333333 0.16666667]]\n",
"T_100: [[0.83333333 0.16666667]\n",
" [0.83333333 0.16666667]]\n"
]
}
],
"source": [
"import numpy as np\n",
"\n",
"#Here we declare the Transition Matrix T\n",
"T = np.array([[0.90, 0.10],\n",
" [0.50, 0.50]])\n",
"\n",
"#Obtain T after 5 steps\n",
"T_5 = np.linalg.matrix_power(T, 5)\n",
"\n",
"#Obtain T after 25 steps\n",
"T_25 = np.linalg.matrix_power(T, 25)\n",
"\n",
"#Obtain T after 50 steps\n",
"T_50 = np.linalg.matrix_power(T, 50)\n",
"\n",
"#Obtain T after 100 steps\n",
"T_100 = np.linalg.matrix_power(T, 100)\n",
"\n",
"#Print the matrices\n",
"print(\"T: \" + str(T))\n",
"print(\"T_5: \" + str(T_5))\n",
"print(\"T_25: \" + str(T_25))\n",
"print(\"T_50: \" + str(T_50))\n",
"print(\"T_100: \" + str(T_100))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
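The powers of T above converge to a matrix whose rows both equal the stationary distribution of the chain. As a small cross-check that is not part of the original notebook, the same limit can be obtained directly: the stationary distribution pi satisfies pi T = pi, so it is a left eigenvector of T for eigenvalue 1.

import numpy as np

T = np.array([[0.90, 0.10],
              [0.50, 0.50]])

# Left eigenvectors of T are right eigenvectors of T transposed.
eigvals, eigvecs = np.linalg.eig(T.T)
idx = np.argmin(np.abs(eigvals - 1.0))  # pick the eigenvalue closest to 1
pi = np.real(eigvecs[:, idx])
pi = pi / pi.sum()                      # normalize to a probability distribution

print(pi)  # [0.83333333 0.16666667], matching the rows of T_50 and T_100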
================================================
FILE: Reinforcement Learning/Defining Initial Distribution.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let us now define the initial distribution which represents the state of the system at k=0.\n",
"Our system is composed of two states and we can model the initial distribution as a vector with two elements, the first element of the vector represents the probability of staying in the state s0 and the second element the probability of staying in state s1. Let’s suppose that we start from s0, the vector v representing the initial distribution will have this form:\n",
"v=(1,0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Calculating the probability of being in a specific state after k iterations:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"v: [[1. 0.]]\n",
"v_1: [[0.9 0.1]]\n",
"v_5: [[0.83504 0.16496]]\n",
"v_25: [[0.83333333 0.16666667]]\n",
"v_50: [[0.83333333 0.16666667]]\n",
"v_100: [[0.83333333 0.16666667]]\n"
]
}
],
"source": [
"import numpy as np\n",
"\n",
"#Declare the initial distribution\n",
"v = np.array([[1.0, 0.0]])\n",
"\n",
"#Declare the Transition Matrix T(this is the same matrix used as in the file'Calculating Transition Probabilities')\n",
"T = np.array([[0.90, 0.10],\n",
" [0.50, 0.50]])\n",
"\n",
"#Obtain T after 5 steps\n",
"T_5 = np.linalg.matrix_power(T, 5)\n",
"\n",
"#Obtain T after 25 steps\n",
"T_25 = np.linalg.matrix_power(T, 25)\n",
"\n",
"#Obtain T after 50 steps\n",
"T_50 = np.linalg.matrix_power(T, 50)\n",
"\n",
"#Obtain T after 100 steps\n",
"T_100 = np.linalg.matrix_power(T, 100)\n",
"\n",
"#Printing the initial distribution\n",
"print(\"v: \" + str(v))\n",
"print(\"v_1: \" + str(np.dot(v,T)))\n",
"print(\"v_5: \" + str(np.dot(v,T_5)))\n",
"print(\"v_25: \" + str(np.dot(v,T_25)))\n",
"print(\"v_50: \" + str(np.dot(v,T_50)))\n",
"print(\"v_100: \" + str(np.dot(v,T_100)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The result after 50 and 100 iterations are the same and v_50 is equal to v_100 no matter which starting distribution we have. The chain converged to equilibrium meaning that as the time progresses it forgets about the starting distribution."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
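The claim above, that the equilibrium does not depend on the starting distribution, can be checked by repeating the experiment with a second initial vector. The (0.2, 0.8) used below is an arbitrary hypothetical choice, not something taken from the notebook.

import numpy as np

T = np.array([[0.90, 0.10],
              [0.50, 0.50]])
T_100 = np.linalg.matrix_power(T, 100)

for v in (np.array([[1.0, 0.0]]), np.array([[0.2, 0.8]])):
    # Both starting distributions end up at the same equilibrium (5/6, 1/6).
    print(v, "->", np.dot(v, T_100))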
================================================
FILE: Reinforcement Learning/Policy Iteration Algorithm.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Policy iteration is guaranteed to converge and at convergence, the current policy and its utility function are the optimal policy and the optimal utility function. First of all, we define a policy π which assigns an action to each state. We can assign random actions to this policy, it does not matter.\n",
"Once we evaluate the policy we can improve it. The policy improvement is the second and last step of the algorithm. Our environment has a finite number of states and then a finite number of policies. Each iteration yields to a better policy."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Implementing the policy iteration algorithm:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def return_policy_evaluation(p, u, r, T, gamma):\n",
"\n",
" #v is the state vector\n",
" #T is the transition matrix\n",
" #u is the utility vector\n",
" #reward consists of the rewards earned for moving to a particular state\n",
" #gamma is the discount factor by which rewards are discounted over the time\n",
" for s in range(12):\n",
" if not np.isnan(p[s]):\n",
" v = np.zeros((1,12))\n",
" v[0,s] = 1.0\n",
" action = int(p[s])\n",
" u[s] = r[s] + gamma * np.sum(np.multiply(u, np.dot(v, T[:,:,action])))\n",
" return u"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"def return_expected_action(u, T, v):\n",
" \n",
"# It returns an action based on the\n",
"# expected utility of doing a in state s, \n",
"# according to T and u. This action is\n",
"# the one that maximize the expected\n",
"# utility.\n",
" \n",
" actions_array = np.zeros(4)\n",
" for action in range(4):\n",
" #Expected utility of doing a in state s, according to T and u.\n",
" actions_array[action] = np.sum(np.multiply(u, np.dot(v, T[:,:,action])))\n",
" return np.argmax(actions_array)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"def print_policy(p, shape):\n",
" \"\"\"Printing utility.\n",
"\n",
" Print the policy actions using symbols:\n",
" ^, v, <, > up, down, left, right\n",
" * terminal states\n",
" # obstacles\n",
" \"\"\"\n",
" counter = 0\n",
" policy_string = \"\"\n",
" for row in range(shape[0]):\n",
" for col in range(shape[1]):\n",
" if(p[counter] == -1): policy_string += \" * \" \n",
" elif(p[counter] == 0): policy_string += \" ^ \"\n",
" elif(p[counter] == 1): policy_string += \" < \"\n",
" elif(p[counter] == 2): policy_string += \" v \" \n",
" elif(p[counter] == 3): policy_string += \" > \"\n",
" elif(np.isnan(p[counter])): policy_string += \" # \"\n",
" counter += 1\n",
" policy_string += '\\n'\n",
" print(policy_string)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" v < > * \n",
" ^ # < * \n",
" < < ^ v \n",
"\n",
" ^ > > * \n",
" ^ # ^ * \n",
" < > ^ v \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" > > ^ < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" > > ^ < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ > ^ < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ > ^ < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < ^ < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < ^ < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < ^ < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < < < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < < < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < < < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < < < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < < < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < < < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < < < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < < < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < < < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < < < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < < < \n",
"\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < < < \n",
"\n",
"=================== FINAL RESULT ==================\n",
"Iterations: 22\n",
"Delta: 9.043213450299348e-08\n",
"Gamma: 0.999\n",
"Epsilon: 0.0001\n",
"===================================================\n",
"[0.80796344 0.86539911 0.91653199 1. ]\n",
"[ 0.75696624 0. 0.65836281 -1. ]\n",
"[0.69968295 0.64882105 0.60471972 0.38150427]\n",
"===================================================\n",
" > > > * \n",
" ^ # ^ * \n",
" ^ < < < \n",
"\n",
"===================================================\n"
]
}
],
"source": [
"def main():\n",
" gamma = 0.999\n",
" epsilon = 0.0001\n",
" iteration = 0\n",
" T = np.load(\"T.npy\")\n",
" #Generate the first policy randomly\n",
" # NaN=Nothing, -1=Terminal, 0=Up, 1=Left, 2=Down, 3=Right\n",
" p = np.random.randint(0, 4, size=(12)).astype(np.float32)\n",
" p[5] = np.NaN\n",
" p[3] = p[7] = -1\n",
" #Utility vectors\n",
" u = np.array([0.0, 0.0, 0.0, 0.0,\n",
" 0.0, 0.0, 0.0, 0.0,\n",
" 0.0, 0.0, 0.0, 0.0])\n",
" #Reward vector\n",
" r = np.array([-0.04, -0.04, -0.04, +1.0,\n",
" -0.04, 0.0, -0.04, -1.0,\n",
" -0.04, -0.04, -0.04, -0.04])\n",
"\n",
" while True:\n",
" iteration += 1\n",
" #1- Policy evaluation\n",
" u_0 = u.copy()\n",
" u = return_policy_evaluation(p, u, r, T, gamma)\n",
" #Stopping criteria\n",
" delta = np.absolute(u - u_0).max()\n",
" if delta < epsilon * (1 - gamma) / gamma: break\n",
" for s in range(12):\n",
" if not np.isnan(p[s]) and not p[s]==-1:\n",
" v = np.zeros((1,12))\n",
" v[0,s] = 1.0\n",
" #2- Policy improvement\n",
" a = return_expected_action(u, T, v) \n",
" if a != p[s]: p[s] = a\n",
" print_policy(p, shape=(3,4))\n",
"\n",
" print(\"=================== FINAL RESULT ==================\")\n",
" print(\"Iterations: \" + str(iteration))\n",
" print(\"Delta: \" + str(delta))\n",
" print(\"Gamma: \" + str(gamma))\n",
" print(\"Epsilon: \" + str(epsilon))\n",
" print(\"===================================================\")\n",
" print(u[0:4])\n",
" print(u[4:8])\n",
" print(u[8:12])\n",
" print(\"===================================================\")\n",
" print_policy(p, shape=(3,4))\n",
" print(\"===================================================\")\n",
"\n",
"if __name__ == \"__main__\":\n",
" main()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
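The notebook above runs policy iteration on the 3 x 4 grid world with a single evaluation sweep per outer iteration. For reference, the sketch below shows the textbook form of the algorithm on a hypothetical 2-state, 2-action MDP (all numbers are illustrative, not taken from T.npy): the evaluation step is solved exactly as a linear system, and the loop stops when greedy improvement no longer changes the policy.

import numpy as np

# Hypothetical MDP: T[s, s', a] = P(s' | s, a), r[s] = reward of state s.
T = np.zeros((2, 2, 2))
T[:, :, 0] = [[0.9, 0.1],
              [0.5, 0.5]]
T[:, :, 1] = [[0.2, 0.8],
              [0.0, 1.0]]
r = np.array([-0.04, 1.0])
gamma = 0.9

p = np.zeros(2, dtype=int)  # arbitrary initial policy: always take action 0

while True:
    # 1 - Policy evaluation: solve u = r + gamma * T_p u exactly for the fixed policy.
    T_p = np.array([T[s, :, p[s]] for s in range(2)])
    u = np.linalg.solve(np.eye(2) - gamma * T_p, r)
    # 2 - Policy improvement: act greedily with respect to the evaluated utilities.
    p_new = np.array([np.argmax([np.dot(T[s, :, a], u) for a in range(2)])
                      for s in range(2)])
    if np.array_equal(p_new, p):
        break
    p = p_new

print("optimal policy:", p, "utilities:", u)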
================================================
FILE: Reinforcement Learning/Value Iteration Algorithm.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Value Iteration algorithm uses the calculated utilities of all the states and compares them after an equilibrium is reached to calculate which is the best move to be taken. The algorithm reaches an equlibrium and this can be known using a stopping criteria. The stopping criteria taken is when no state's utility gets changed by much between two consecutive iterations."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Implementing the Value Iteration algorithm:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def state_utility(v, T, u, reward, gamma):\n",
" \n",
" #v is the state vector\n",
" #T is the transition matrix\n",
" #u is the utility vector\n",
" #reward consists of the rewards earned for moving to a particular state\n",
" #gamma is the discount factor by which rewards are discounted over the time\n",
"\n",
" action_array = np.zeros(4)\n",
" for action in range(0, 4):\n",
" action_array[action] = np.sum(np.multiply(u, np.dot(v, T[:,:,action])))\n",
" return reward + gamma * np.max(action_array)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"=================== FINAL RESULT ==================\n",
"Iterations: 26\n",
"Delta: 9.511968687869743e-06\n",
"Gamma: 0.999\n",
"Epsilon: 0.01\n",
"===================================================\n",
"[0.80796341 0.86539911 0.91653199 1. ]\n",
"[ 0.75696613 0. 0.65836281 -1. ]\n",
"[0.69968168 0.64881721 0.60471137 0.3814863 ]\n",
"===================================================\n"
]
}
],
"source": [
"def main():\n",
" \n",
" tot_states = 12\n",
" gamma = 0.999 \n",
" iteration = 0 #Iteration counter\n",
" epsilon = 0.01 #Stopping criteria given a small value\n",
"\n",
" #List containing the data for each iteation\n",
" graph_list = list()\n",
"\n",
" #Transition matrix loaded from file\n",
" T = np.load(\"T.npy\")\n",
"\n",
" #Reward vector\n",
" r = np.array([-0.04, -0.04, -0.04, +1.0,\n",
" -0.04, 0.0, -0.04, -1.0,\n",
" -0.04, -0.04, -0.04, -0.04]) \n",
"\n",
" #Utility vectors\n",
" u = np.array([0.0, 0.0, 0.0, 0.0,\n",
" 0.0, 0.0, 0.0, 0.0,\n",
" 0.0, 0.0, 0.0, 0.0])\n",
" \n",
" u1 = np.array([0.0, 0.0, 0.0, 0.0,\n",
" 0.0, 0.0, 0.0, 0.0,\n",
" 0.0, 0.0, 0.0, 0.0])\n",
"\n",
" while True:\n",
" delta = 0\n",
" u = u1.copy()\n",
" iteration += 1\n",
" graph_list.append(u)\n",
" for s in range(tot_states):\n",
" reward = r[s]\n",
" v = np.zeros((1,tot_states))\n",
" v[0,s] = 1.0\n",
" u1[s] = state_utility(v, T, u, reward, gamma)\n",
" delta = max(delta, np.abs(u1[s] - u[s])) #Stopping criteria checked \n",
" \n",
" if delta < epsilon * (1 - gamma) / gamma:\n",
" print(\"=================== FINAL RESULT ==================\")\n",
" print(\"Iterations: \" + str(iteration))\n",
" print(\"Delta: \" + str(delta))\n",
" print(\"Gamma: \" + str(gamma))\n",
" print(\"Epsilon: \" + str(epsilon))\n",
" print(\"===================================================\")\n",
" print(u[0:4])\n",
" print(u[4:8])\n",
" print(u[8:12])\n",
" print(\"===================================================\")\n",
" break\n",
"\n",
"if __name__ == \"__main__\":\n",
" main()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
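For comparison with the grid-world notebooks, the same Bellman optimality backup can be written as a compact value-iteration loop. The sketch below reuses the hypothetical 2-state, 2-action MDP from the earlier sketches (illustrative numbers only, not the grid world stored in T.npy) and the same stopping criterion as the notebook above.

import numpy as np

T = np.zeros((2, 2, 2))  # T[s, s', a] = P(s' | s, a)
T[:, :, 0] = [[0.9, 0.1],
              [0.5, 0.5]]
T[:, :, 1] = [[0.2, 0.8],
              [0.0, 1.0]]
r = np.array([-0.04, 1.0])
gamma, epsilon = 0.9, 1e-6

u = np.zeros(2)
while True:
    # Bellman optimality backup for every state.
    u_new = np.array([r[s] + gamma * max(np.dot(T[s, :, a], u) for a in range(2))
                      for s in range(2)])
    delta = np.abs(u_new - u).max()
    u = u_new
    if delta < epsilon * (1 - gamma) / gamma:  # same stopping rule as above
        break

print(u)  # converged utilities for s0 and s1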
================================================
FILE: classification_and_regression_trees/bikeSpeedVsIq_test.txt
================================================
12.000000 121.010516
19.000000 157.337044
12.000000 116.031825
15.000000 132.124872
2.000000 52.719612
6.000000 39.058368
3.000000 50.757763
20.000000 166.740333
11.000000 115.808227
21.000000 165.582995
3.000000 41.956087
3.000000 34.432370
13.000000 116.954676
1.000000 32.112553
7.000000 50.380243
7.000000 94.107791
23.000000 188.943179
18.000000 152.637773
9.000000 104.122082
18.000000 127.805226
0.000000 83.083232
15.000000 148.180104
3.000000 38.480247
8.000000 77.597839
7.000000 75.625803
11.000000 124.620208
13.000000 125.186698
5.000000 51.165922
3.000000 31.179113
15.000000 132.505727
19.000000 137.978043
9.000000 106.481123
20.000000 172.149955
11.000000 104.116556
4.000000 22.457996
20.000000 175.735047
18.000000 165.350412
22.000000 177.461724
16.000000 138.672986
17.000000 156.791788
19.000000 150.327544
19.000000 156.992196
23.000000 163.624262
8.000000 92.537227
3.000000 32.341399
16.000000 144.445614
11.000000 119.985586
16.000000 145.149335
12.000000 113.284662
5.000000 47.742716
11.000000 115.852585
3.000000 31.579325
1.000000 43.758671
1.000000 61.049125
13.000000 132.751826
23.000000 163.233087
12.000000 115.134296
8.000000 91.370839
8.000000 86.137955
14.000000 120.857934
3.000000 33.777477
10.000000 110.831763
10.000000 104.174775
20.000000 155.920696
4.000000 30.619132
0.000000 71.880474
7.000000 86.399516
7.000000 72.632906
5.000000 58.632985
18.000000 143.584511
23.000000 187.059504
6.000000 65.067119
6.000000 69.110280
19.000000 142.388056
15.000000 137.174489
21.000000 159.719092
9.000000 102.179638
20.000000 176.416294
21.000000 146.516385
18.000000 147.808343
23.000000 154.790810
16.000000 137.385285
18.000000 166.885975
15.000000 136.989000
20.000000 144.668679
14.000000 137.060671
19.000000 140.468283
11.000000 98.344084
16.000000 132.497910
1.000000 59.143101
20.000000 152.299381
13.000000 134.487271
0.000000 77.805718
3.000000 28.543764
10.000000 97.751817
4.000000 41.223659
11.000000 110.017015
12.000000 119.391386
20.000000 158.872126
2.000000 38.776222
19.000000 150.496148
15.000000 131.505967
22.000000 179.856157
13.000000 143.090102
14.000000 142.611861
13.000000 120.757410
4.000000 27.929324
16.000000 151.530849
15.000000 148.149702
5.000000 44.188084
16.000000 141.135406
12.000000 119.817665
8.000000 80.991524
3.000000 29.308640
6.000000 48.203468
8.000000 92.179834
22.000000 162.720371
10.000000 91.971158
2.000000 33.481943
8.000000 88.528612
1.000000 54.042173
8.000000 92.002928
5.000000 45.614646
3.000000 34.319635
14.000000 129.140558
17.000000 146.807901
17.000000 157.694058
4.000000 37.080929
20.000000 169.942381
10.000000 114.675638
5.000000 34.913029
14.000000 137.889747
0.000000 79.043129
16.000000 139.084390
6.000000 53.340135
13.000000 142.772612
0.000000 73.103173
3.000000 37.717487
15.000000 134.116395
18.000000 138.748257
23.000000 180.779121
10.000000 93.721894
23.000000 166.958335
6.000000 74.473589
6.000000 73.006291
3.000000 34.178656
1.000000 33.395482
22.000000 149.933384
18.000000 154.858982
6.000000 66.121084
1.000000 60.816800
5.000000 55.681020
6.000000 61.251558
15.000000 125.452206
16.000000 134.310255
19.000000 167.999681
5.000000 40.074830
22.000000 162.658997
12.000000 109.473909
4.000000 44.743405
11.000000 122.419496
14.000000 139.852014
21.000000 160.045407
15.000000 131.999358
15.000000 135.577799
20.000000 173.494629
8.000000 82.497177
12.000000 123.122032
10.000000 97.592026
16.000000 141.345706
8.000000 79.588881
3.000000 54.308878
4.000000 36.112937
19.000000 165.005336
23.000000 172.198031
15.000000 127.699625
1.000000 47.305217
13.000000 115.489379
8.000000 103.956569
4.000000 53.669477
0.000000 76.220652
12.000000 114.153306
6.000000 74.608728
3.000000 41.339299
5.000000 21.944048
22.000000 181.455655
20.000000 171.691444
10.000000 104.299002
21.000000 168.307123
20.000000 169.556523
23.000000 175.960552
1.000000 42.554778
14.000000 137.286185
16.000000 136.126561
12.000000 119.269042
6.000000 63.426977
4.000000 27.728212
4.000000 32.687588
23.000000 151.153204
15.000000 129.767331
================================================
FILE: classification_and_regression_trees/bikeSpeedVsIq_train.txt
================================================
3.000000 46.852122
23.000000 178.676107
0.000000 86.154024
6.000000 68.707614
15.000000 139.737693
17.000000 141.988903
12.000000 94.477135
8.000000 86.083788
9.000000 97.265824
7.000000 80.400027
8.000000 83.414554
1.000000 52.525471
16.000000 127.060008
9.000000 101.639269
14.000000 146.412680
15.000000 144.157101
17.000000 152.699910
19.000000 136.669023
21.000000 166.971736
21.000000 165.467251
3.000000 38.455193
6.000000 75.557721
4.000000 22.171763
5.000000 50.321915
0.000000 74.412428
5.000000 42.052392
1.000000 42.489057
14.000000 139.185416
21.000000 140.713725
5.000000 63.222944
5.000000 56.294626
9.000000 91.674826
22.000000 173.497655
17.000000 152.692482
9.000000 113.920633
1.000000 51.552411
9.000000 100.075315
16.000000 137.803868
18.000000 135.925777
3.000000 45.550762
16.000000 149.933224
2.000000 27.914173
6.000000 62.103546
20.000000 173.942381
12.000000 119.200505
6.000000 70.730214
16.000000 156.260832
15.000000 132.467643
19.000000 161.164086
17.000000 138.031844
23.000000 169.747881
11.000000 116.761920
4.000000 34.305905
6.000000 68.841160
10.000000 119.535227
20.000000 158.104763
18.000000 138.390511
5.000000 59.375794
7.000000 80.802300
11.000000 108.611485
10.000000 91.169028
15.000000 154.104819
5.000000 51.100287
3.000000 32.334330
15.000000 150.551655
10.000000 111.023073
0.000000 87.489950
2.000000 46.726299
7.000000 92.540440
15.000000 135.715438
19.000000 152.960552
19.000000 162.789223
21.000000 167.176240
22.000000 164.323358
12.000000 104.823071
1.000000 35.554328
11.000000 114.784640
1.000000 36.819570
12.000000 130.266826
12.000000 126.053312
18.000000 153.378289
7.000000 70.089159
15.000000 139.528624
19.000000 157.137999
23.000000 183.595248
7.000000 73.431043
11.000000 128.176167
22.000000 183.181247
13.000000 112.685801
18.000000 161.634783
6.000000 63.169478
7.000000 63.393975
19.000000 165.779578
14.000000 143.973398
22.000000 185.131852
3.000000 45.275591
6.000000 62.018003
0.000000 83.193398
7.000000 76.847802
19.000000 147.087386
7.000000 62.812086
1.000000 49.910068
11.000000 102.169335
11.000000 105.108121
6.000000 63.429817
12.000000 121.301542
17.000000 163.253962
13.000000 119.588698
0.000000 87.333807
20.000000 144.484066
21.000000 168.792482
23.000000 159.751246
20.000000 162.843592
14.000000 145.664069
19.000000 146.838515
12.000000 132.049377
18.000000 155.756119
22.000000 155.686345
7.000000 73.913958
1.000000 66.761881
7.000000 65.855450
6.000000 56.271026
19.000000 155.308523
12.000000 124.372873
17.000000 136.025960
14.000000 132.996861
21.000000 172.639791
17.000000 135.672594
8.000000 90.323742
5.000000 62.462698
16.000000 159.048794
14.000000 139.991227
3.000000 37.026678
9.000000 100.839901
9.000000 93.097395
15.000000 123.645221
15.000000 147.327185
1.000000 40.055830
0.000000 88.192829
17.000000 139.174517
22.000000 169.354493
17.000000 136.354272
9.000000 90.692829
7.000000 63.987997
14.000000 128.972231
10.000000 108.433394
2.000000 49.321034
19.000000 171.615671
9.000000 97.894855
0.000000 68.962453
9.000000 72.063371
22.000000 157.000070
12.000000 114.461754
6.000000 58.239465
9.000000 104.601048
8.000000 90.772359
22.000000 164.428791
5.000000 34.804083
5.000000 37.089459
22.000000 177.987605
10.000000 89.439608
6.000000 70.711362
23.000000 181.731482
20.000000 151.538932
7.000000 66.067228
6.000000 61.565125
20.000000 184.441687
9.000000 91.569158
9.000000 98.833425
17.000000 144.352866
9.000000 94.498314
15.000000 121.922732
18.000000 166.408274
10.000000 89.571299
8.000000 75.373772
22.000000 161.001478
8.000000 90.594227
5.000000 57.180933
20.000000 161.643007
8.000000 87.197370
8.000000 95.584308
15.000000 126.207221
7.000000 84.528209
18.000000 161.056986
10.000000 86.762615
1.000000 33.325906
9.000000 105.095502
2.000000 22.440421
9.000000 93.449284
14.000000 106.249595
21.000000 163.254385
22.000000 161.746628
20.000000 152.973085
17.000000 122.918987
7.000000 58.536412
1.000000 45.013277
13.000000 137.294148
10.000000 88.123737
2.000000 45.847376
20.000000 163.385797
================================================
FILE: classification_and_regression_trees/compare.py
================================================
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt

from regression_tree import *
from model_tree import linear_regression
def get_corrcoef(X, Y):
    # Covariance of X and Y
cov = np.mean(X*Y) - np.mean(X)*np.mean(Y)
return cov/(np.var(X)*np.var(Y))**0.5
if '__main__' == __name__:
    # Load the training and test data
data_train = load_data('bikeSpeedVsIq_train.txt')
data_test = load_data('bikeSpeedVsIq_test.txt')
dataset_test = np.matrix(data_test)
m, n = dataset_test.shape
testset = np.ones((m, n+1))
testset[:, 1:] = dataset_test
X_test, y_test = testset[:, :-1], testset[:, -1]
    # Fit the standard linear regression model
w, X, y = linear_regression(data_train)
y_lr = X_test*w
y_test = np.array(y_test).T
y_lr = np.array(y_lr).T[0]
corrcoef_lr = get_corrcoef(y_test, y_lr)
print('linear regression correlation coefficient: {}'.format(corrcoef_lr))
    # Build the regression tree model
tree = create_tree(data_train, fleaf, ferr, opt={'err_tolerance': 1,
'n_tolerance': 4})
y_tree = [tree_predict([x], tree) for x in X_test[:, 1].tolist()]
corrcoef_tree = get_corrcoef(np.array(y_tree), y_test)
print('regression tree correlation coefficient: {}'.format(corrcoef_tree))
plt.scatter(np.array(data_train)[:, 0], np.array(data_train)[:, 1])
    # Plot the linear regression line
x = np.sort([i for i in X_test[:, 1].tolist()])
y = [np.dot([1.0, i], np.array(w.T).tolist()[0]) for i in x]
plt.plot(x, y, c='r')
    # Plot the regression tree prediction curve
y = [tree_predict([i], tree) for i in x]
plt.plot(x, y, c='y')
plt.show()
================================================
FILE: classification_and_regression_trees/dot/ex0.dot
================================================
digraph decision_tree {
"5db27cbb-29af-4987-9cd2-9217c781000d" [label="0: 0.400158"];
"a81daf61-ab07-4e65-8b8a-55ee0bd0b40c" [label="0: 0.208197"];
"1f1412f1-659b-4347-8013-f6e57e634c2b" [label="-0.02"];
"32292eec-1e38-4eff-9700-cba03d93d7d8" [label="1.03"];
"7cee5c66-0140-4be7-ab6e-01245b3c8199" [label="0: 0.609483"];
"3308a031-f17e-494c-b015-9b5f3d904dba" [label="1.98"];
"d53cb038-a0fa-4635-9a07-36d40c33d6b9" [label="0: 0.816742"];
"adeac9bb-ef8a-4a91-a821-bade5e047d0d" [label="2.98"];
"208da594-0e82-4b01-af38-b60bac08624d" [label="3.99"];
"5db27cbb-29af-4987-9cd2-9217c781000d" -> "a81daf61-ab07-4e65-8b8a-55ee0bd0b40c";
"a81daf61-ab07-4e65-8b8a-55ee0bd0b40c" -> "1f1412f1-659b-4347-8013-f6e57e634c2b";
"a81daf61-ab07-4e65-8b8a-55ee0bd0b40c" -> "32292eec-1e38-4eff-9700-cba03d93d7d8";
"5db27cbb-29af-4987-9cd2-9217c781000d" -> "7cee5c66-0140-4be7-ab6e-01245b3c8199";
"7cee5c66-0140-4be7-ab6e-01245b3c8199" -> "3308a031-f17e-494c-b015-9b5f3d904dba";
"7cee5c66-0140-4be7-ab6e-01245b3c8199" -> "d53cb038-a0fa-4635-9a07-36d40c33d6b9";
"d53cb038-a0fa-4635-9a07-36d40c33d6b9" -> "adeac9bb-ef8a-4a91-a821-bade5e047d0d";
"d53cb038-a0fa-4635-9a07-36d40c33d6b9" -> "208da594-0e82-4b01-af38-b60bac08624d";
}
================================================
FILE: classification_and_regression_trees/dot/ex00.dot
================================================
digraph decision_tree {
"ccd352d8-dbf6-4f59-ae0b-c983f39e5c87" [label="0: 0.50794"];
"46052817-27f4-4748-8f02-43ee4d2315dc" [label="-0.04"];
"b2df42be-284f-415f-8db7-18f807450a5b" [label="1.02"];
"ccd352d8-dbf6-4f59-ae0b-c983f39e5c87" -> "46052817-27f4-4748-8f02-43ee4d2315dc";
"ccd352d8-dbf6-4f59-ae0b-c983f39e5c87" -> "b2df42be-284f-415f-8db7-18f807450a5b";
}
================================================
FILE: classification_and_regression_trees/dot/ex2.dot
================================================
digraph decision_tree {
"bdbe6f68-a446-4539-8a80-860f22663afe" [label="0: 0.508542"];
"acef94b2-b18f-4c9c-bb21-4caa8279f319" [label="0: 0.463241"];
"31303341-5a1f-4167-83ce-22e8ea1e462f" [label="0: 0.130626"];
"1ee2e839-eb76-48b6-8149-a694de0bc740" [label="0: 0.085111"];
"1294ba07-97b4-44da-b30f-6dc8de1f2506" [label="0: 0.053764"];
"bb2ea4a5-e090-43bd-a89f-47459726a906" [label="4.09"];
"cf4e8349-aae6-4f47-b2e3-bdfad76140b9" [label="-2.54"];
"746064bd-117c-46d6-bbbd-8f3943d2b418" [label="6.51"];
"3c7d308c-4094-44ed-a689-23971c62ea5a" [label="0: 0.377383"];
"8cd38fe6-64f6-48f5-b3da-1755767c9c9e" [label="0: 0.3417"];
"8fa7515c-8c27-400c-b2eb-fa95acd8f8c0" [label="0: 0.32889"];
"c7820d96-a919-408a-a48d-6472c6b6cbe4" [label="0: 0.300318"];
"046be329-b0aa-49f7-b3d6-7fc216d478c9" [label="0: 0.176523"];
"0a30919d-85b8-4c93-bfbb-f1fcff96dba6" [label="0: 0.156273"];
"4ad75fa1-e0b7-4678-be73-5de3775d397e" [label="-6.25"];
"398d6223-1ef9-4c45-9ac7-44f61850dc45" [label="-12.11"];
"3a51d811-12ca-46b5-8548-f40a6571b263" [label="0: 0.203993"];
"2460b0c3-0648-4d24-8a57-e996df15a425" [label="3.45"];
"a595f63d-5bc6-4901-8802-749a77568531" [label="0: 0.218321"];
"a4cedd79-a7f9-4d67-acd2-bba51563dc30" [label="-11.82"];
"9e2b62f5-877e-4c05-82e1-8264fe924106" [label="0: 0.228628"];
"47a064c2-9c82-4e09-a069-80e6cd741aba" [label="6.77"];
"78788a82-c815-4891-8933-d0d2d39e8ace" [label="0: 0.264639"];
"83554092-3407-44a5-89d9-b1954126079b" [label="-13.07"];
"26d04a07-eb77-4691-aca6-3bf94fae2063" [label="0.40"];
"aabebca3-28f6-4bb4-aa90-9393dfa9479c" [label="-19.99"];
"217e5a9d-99f6-48f6-b60d-4940de06b873" [label="15.06"];
"322a229e-98a5-4f9d-b6ce-5be216bb9b66" [label="0: 0.351478"];
"20c1cd26-5c8f-45ef-8557-34bdc7e7e936" [label="-22.69"];
"d142ca3c-8c45-4ff5-a975-5614ec532660" [label="-15.09"];
"dd4807c3-e5dd-4139-85a3-673bc6384037" [label="0: 0.446196"];
"98190fc7-96ed-4c83-b200-1305f1663d12" [label="0: 0.418943"];
"a2389757-85a4-4e78-82e5-3ddcb540cb5f" [label="0: 0.388789"];
"3d1b079c-d685-4810-b7f4-c88824625a53" [label="3.66"];
"0d29f899-b06a-4c70-aa65-9fbf6794eca4" [label="-0.89"];
"25f74f36-c5be-4b83-a233-0c555c86ead9" [label="14.38"];
"cadc3d94-2545-4a48-be70-6915c3fa2795" [label="-12.56"];
"a57b9084-97cc-450d-bc7d-8feefef45a82" [label="0: 0.483803"];
"8b548b72-2bc2-499c-bb77-12b597d0e793" [label="3.43"];
"a8a356da-2c12-474c-a463-8110c1f6b8bf" [label="12.51"];
"e9657d5a-ad84-4e6f-acfe-6cfaa5f88780" [label="0: 0.731636"];
"39dce041-ce2f-44d7-bdbd-45cf38da4af8" [label="0: 0.642373"];
"9aebff68-5a03-4060-899d-345ef1fddace" [label="0: 0.618868"];
"5e1c0e83-b7d6-48b0-9ef5-560cbca09f44" [label="0: 0.585413"];
"7daf39a6-a515-4506-a06e-23af1edbec6a" [label="0: 0.560301"];
"097d7583-bea8-4b59-8cf6-bb1a7ec82295" [label="0: 0.531944"];
"f4e8defb-d0a3-4328-9fb1-835d3f85e406" [label="101.74"];
"b5524dcb-e110-4fca-95ba-eec513d60fdb" [label="0: 0.546601"];
"c68f0abe-5d5a-4493-8cf7-6818f9601baa" [label="110.98"];
"6bb0d9b4-4be8-468e-a888-19c08287fce3" [label="109.39"];
"d51fb426-e5ec-416b-b844-e069df5f6dc8" [label="97.20"];
"09721c80-fa8e-4cec-b962-32a79348d015" [label="123.21"];
"841fee93-2890-490a-b2c8-8bb720c32ea6" [label="93.67"];
"c53bc2c2-dfc2-4375-9f88-9736f075c294" [label="0: 0.667851"];
"c55165f2-d3c8-4b70-8d8c-ace5415b172f" [label="114.15"];
"a0ae2570-2f84-4f3f-aa45-689cdc3f64cf" [label="0: 0.70889"];
"b640a2a1-5fd8-4063-869a-ff60c957dbc2" [label="0: 0.69892"];
"1c5d7da8-d76c-444c-aa07-c8811130765d" [label="108.93"];
"1fe9a864-a467-4aaf-b9f1-87bfc45c25b5" [label="104.82"];
"c7358239-c55b-413d-aa76-540e5994f84f" [label="114.55"];
"709c2602-24e3-4f9a-bb0d-8c089399c018" [label="0: 0.953902"];
"80f583a9-c4eb-41af-bef3-cfb7d38d41ee" [label="0: 0.763328"];
"9c72f602-2240-4ea7-9c0b-f7269fc86618" [label="78.09"];
"6b400587-213d-483b-8ad0-110dc6d85507" [label="0: 0.798198"];
"73c06578-7739-4122-98e0-63cb9dc39f81" [label="102.36"];
"8ac0c1b7-2b2d-44b4-99f4-282aae0fe308" [label="0: 0.838587"];
"633bb485-db8e-4d5d-8d68-1fc08619a673" [label="0: 0.815215"];
"0b343b88-8a2a-49ae-a2d7-bd624044698b" [label="88.78"];
"7da8e0bd-b3b8-4092-a34e-1ba8cd3ffbf7" [label="81.11"];
"d3bb79be-7b85-4c0f-b5b7-dbb6bd4c54a2" [label="0: 0.948822"];
"f3081916-79ff-4e39-8b30-fa054670c50f" [label="0: 0.856421"];
"829fae81-4a91-46f9-8c6f-7559dd670841" [label="95.28"];
"4d529db0-b857-4910-a482-9e61bc87f959" [label="0: 0.912161"];
"818b003f-8abd-4ee8-a4ea-b596111a4f77" [label="0: 0.896683"];
"c5f07126-a0a7-4d03-8ee2-b60dd344013f" [label="0: 0.883615"];
"f9034542-ada8-48be-9db2-e4c9b2be70de" [label="102.25"];
"3257b4c9-7a8a-4cb7-a620-0980005ecb6d" [label="95.18"];
"6f8ddb81-d225-41e7-b1ec-fa6185f55097" [label="104.83"];
"b5e03953-ad6c-4a39-981a-140d7fafc46a" [label="96.45"];
"e85378cb-074a-4d35-ae82-a9a99f9daf50" [label="87.31"];
"dfe3dd59-018c-41ab-8046-21ed7d136b4a" [label="0: 0.960398"];
"2d37edbb-d4d3-4b55-a593-8a2b1ab3ff25" [label="112.43"];
"f075116f-c182-4b0d-9f7e-2dae61982c18" [label="105.25"];
"bdbe6f68-a446-4539-8a80-860f22663afe" -> "acef94b2-b18f-4c9c-bb21-4caa8279f319";
"acef94b2-b18f-4c9c-bb21-4caa8279f319" -> "31303341-5a1f-4167-83ce-22e8ea1e462f";
"31303341-5a1f-4167-83ce-22e8ea1e462f" -> "1ee2e839-eb76-48b6-8149-a694de0bc740";
"1ee2e839-eb76-48b6-8149-a694de0bc740" -> "1294ba07-97b4-44da-b30f-6dc8de1f2506";
"1294ba07-97b4-44da-b30f-6dc8de1f2506" -> "bb2ea4a5-e090-43bd-a89f-47459726a906";
"1294ba07-97b4-44da-b30f-6dc8de1f2506" -> "cf4e8349-aae6-4f47-b2e3-bdfad76140b9";
"1ee2e839-eb76-48b6-8149-a694de0bc740" -> "746064bd-117c-46d6-bbbd-8f3943d2b418";
"31303341-5a1f-4167-83ce-22e8ea1e462f" -> "3c7d308c-4094-44ed-a689-23971c62ea5a";
"3c7d308c-4094-44ed-a689-23971c62ea5a" -> "8cd38fe6-64f6-48f5-b3da-1755767c9c9e";
"8cd38fe6-64f6-48f5-b3da-1755767c9c9e" -> "8fa7515c-8c27-400c-b2eb-fa95acd8f8c0";
"8fa7515c-8c27-400c-b2eb-fa95acd8f8c0" -> "c7820d96-a919-408a-a48d-6472c6b6cbe4";
"c7820d96-a919-408a-a48d-6472c6b6cbe4" -> "046be329-b0aa-49f7-b3d6-7fc216d478c9";
"046be329-b0aa-49f7-b3d6-7fc216d478c9" -> "0a30919d-85b8-4c93-bfbb-f1fcff96dba6";
"0a30919d-85b8-4c93-bfbb-f1fcff96dba6" -> "4ad75fa1-e0b7-4678-be73-5de3775d397e";
"0a30919d-85b8-4c93-bfbb-f1fcff96dba6" -> "398d6223-1ef9-4c45-9ac7-44f61850dc45";
"046be329-b0aa-49f7-b3d6-7fc216d478c9" -> "3a51d811-12ca-46b5-8548-f40a6571b263";
"3a51d811-12ca-46b5-8548-f40a6571b263" -> "2460b0c3-0648-4d24-8a57-e996df15a425";
"3a51d811-12ca-46b5-8548-f40a6571b263" -> "a595f63d-5bc6-4901-8802-749a77568531";
"a595f63d-5bc6-4901-8802-749a77568531" -> "a4cedd79-a7f9-4d67-acd2-bba51563dc30";
"a595f63d-5bc6-4901-8802-749a77568531" -> "9e2b62f5-877e-4c05-82e1-8264fe924106";
"9e2b62f5-877e-4c05-82e1-8264fe924106" -> "47a064c2-9c82-4e09-a069-80e6cd741aba";
"9e2b62f5-877e-4c05-82e1-8264fe924106" -> "78788a82-c815-4891-8933-d0d2d39e8ace";
"78788a82-c815-4891-8933-d0d2d39e8ace" -> "83554092-3407-44a5-89d9-b1954126079b";
"78788a82-c815-4891-8933-d0d2d39e8ace" -> "26d04a07-eb77-4691-aca6-3bf94fae2063";
"c7820d96-a919-408a-a48d-6472c6b6cbe4" -> "aabebca3-28f6-4bb4-aa90-9393dfa9479c";
"8fa7515c-8c27-400c-b2eb-fa95acd8f8c0" -> "217e5a9d-99f6-48f6-b60d-4940de06b873";
"8cd38fe6-64f6-48f5-b3da-1755767c9c9e" -> "322a229e-98a5-4f9d-b6ce-5be216bb9b66";
"322a229e-98a5-4f9d-b6ce-5be216bb9b66" -> "20c1cd26-5c8f-45ef-8557-34bdc7e7e936";
"322a229e-98a5-4f9d-b6ce-5be216bb9b66" -> "d142ca3c-8c45-4ff5-a975-5614ec532660";
"3c7d308c-4094-44ed-a689-23971c62ea5a" -> "dd4807c3-e5dd-4139-85a3-673bc6384037";
"dd4807c3-e5dd-4139-85a3-673bc6384037" -> "98190fc7-96ed-4c83-b200-1305f1663d12";
"98190fc7-96ed-4c83-b200-1305f1663d12" -> "a2389757-85a4-4e78-82e5-3ddcb540cb5f";
"a2389757-85a4-4e78-82e5-3ddcb540cb5f" -> "3d1b079c-d685-4810-b7f4-c88824625a53";
"a2389757-85a4-4e78-82e5-3ddcb540cb5f" -> "0d29f899-b06a-4c70-aa65-9fbf6794eca4";
"98190fc7-96ed-4c83-b200-1305f1663d12" -> "25f74f36-c5be-4b83-a233-0c555c86ead9";
"dd4807c3-e5dd-4139-85a3-673bc6384037" -> "cadc3d94-2545-4a48-be70-6915c3fa2795";
"acef94b2-b18f-4c9c-bb21-4caa8279f319" -> "a57b9084-97cc-450d-bc7d-8feefef45a82";
"a57b9084-97cc-450d-bc7d-8feefef45a82" -> "8b548b72-2bc2-499c-bb77-12b597d0e793";
"a57b9084-97cc-450d-bc7d-8feefef45a82" -> "a8a356da-2c12-474c-a463-8110c1f6b8bf";
"bdbe6f68-a446-4539-8a80-860f22663afe" -> "e9657d5a-ad84-4e6f-acfe-6cfaa5f88780";
"e9657d5a-ad84-4e6f-acfe-6cfaa5f88780" -> "39dce041-ce2f-44d7-bdbd-45cf38da4af8";
"39dce041-ce2f-44d7-bdbd-45cf38da4af8" -> "9aebff68-5a03-4060-899d-345ef1fddace";
"9aebff68-5a03-4060-899d-345ef1fddace" -> "5e1c0e83-b7d6-48b0-9ef5-560cbca09f44";
"5e1c0e83-b7d6-48b0-9ef5-560cbca09f44" -> "7daf39a6-a515-4506-a06e-23af1edbec6a";
"7daf39a6-a515-4506-a06e-23af1edbec6a" -> "097d7583-bea8-4b59-8cf6-bb1a7ec82295";
"097d7583-bea8-4b59-8cf6-bb1a7ec82295" -> "f4e8defb-d0a3-4328-9fb1-835d3f85e406";
"097d7583-bea8-4b59-8cf6-bb1a7ec82295" -> "b5524dcb-e110-4fca-95ba-eec513d60fdb";
"b5524dcb-e110-4fca-95ba-eec513d60fdb" -> "c68f0abe-5d5a-4493-8cf7-6818f9601baa";
"b5524dcb-e110-4fca-95ba-eec513d60fdb" -> "6bb0d9b4-4be8-468e-a888-19c08287fce3";
"7daf39a6-a515-4506-a06e-23af1edbec6a" -> "d51fb426-e5ec-416b-b844-e069df5f6dc8";
"5e1c0e83-b7d6-48b0-9ef5-560cbca09f44" -> "09721c80-fa8e-4cec-b962-32a79348d015";
"9aebff68-5a03-4060-899d-345ef1fddace" -> "841fee93-2890-490a-b2c8-8bb720c32ea6";
"39dce041-ce2f-44d7-bdbd-45cf38da4af8" -> "c53bc2c2-dfc2-4375-9f88-9736f075c294";
"c53bc2c2-dfc2-4375-9f88-9736f075c294" -> "c55165f2-d3c8-4b70-8d8c-ace5415b172f";
"c53bc2c2-dfc2-4375-9f88-9736f075c294" -> "a0ae2570-2f84-4f3f-aa45-689cdc3f64cf";
"a0ae2570-2f84-4f3f-aa45-689cdc3f64cf" -> "b640a2a1-5fd8-4063-869a-ff60c957dbc2";
"b640a2a1-5fd8-4063-869a-ff60c957dbc2" -> "1c5d7da8-d76c-444c-aa07-c8811130765d";
"b640a2a1-5fd8-4063-869a-ff60c957dbc2" -> "1fe9a864-a467-4aaf-b9f1-87bfc45c25b5";
"a0ae2570-2f84-4f3f-aa45-689cdc3f64cf" -> "c7358239-c55b-413d-aa76-540e5994f84f";
"e9657d5a-ad84-4e6f-acfe-6cfaa5f88780" -> "709c2602-24e3-4f9a-bb0d-8c089399c018";
"709c2602-24e3-4f9a-bb0d-8c089399c018" -> "80f583a9-c4eb-41af-bef3-cfb7d38d41ee";
"80f583a9-c4eb-41af-bef3-cfb7d38d41ee" -> "9c72f602-2240-4ea7-9c0b-f7269fc86618";
"80f583a9-c4eb-41af-bef3-cfb7d38d41ee" -> "6b400587-213d-483b-8ad0-110dc6d85507";
"6b400587-213d-483b-8ad0-110dc6d85507" -> "73c06578-7739-4122-98e0-63cb9dc39f81";
"6b400587-213d-483b-8ad0-110dc6d85507" -> "8ac0c1b7-2b2d-44b4-99f4-282aae0fe308";
"8ac0c1b7-2b2d-44b4-99f4-282aae0fe308" -> "633bb485-db8e-4d5d-8d68-1fc08619a673";
"633bb485-db8e-4d5d-8d68-1fc08619a673" -> "0b343b88-8a2a-49ae-a2d7-bd624044698b";
"633bb485-db8e-4d5d-8d68-1fc08619a673" -> "7da8e0bd-b3b8-4092-a34e-1ba8cd3ffbf7";
"8ac0c1b7-2b2d-44b4-99f4-282aae0fe308" -> "d3bb79be-7b85-4c0f-b5b7-dbb6bd4c54a2";
"d3bb79be-7b85-4c0f-b5b7-dbb6bd4c54a2" -> "f3081916-79ff-4e39-8b30-fa054670c50f";
"f3081916-79ff-4e39-8b30-fa054670c50f" -> "829fae81-4a91-46f9-8c6f-7559dd670841";
"f3081916-79ff-4e39-8b30-fa054670c50f" -> "4d529db0-b857-4910-a482-9e61bc87f959";
"4d529db0-b857-4910-a482-9e61bc87f959" -> "818b003f-8abd-4ee8-a4ea-b596111a4f77";
"818b003f-8abd-4ee8-a4ea-b596111a4f77" -> "c5f07126-a0a7-4d03-8ee2-b60dd344013f";
"c5f07126-a0a7-4d03-8ee2-b60dd344013f" -> "f9034542-ada8-48be-9db2-e4c9b2be70de";
"c5f07126-a0a7-4d03-8ee2-b60dd344013f" -> "3257b4c9-7a8a-4cb7-a620-0980005ecb6d";
"818b003f-8abd-4ee8-a4ea-b596111a4f77" -> "6f8ddb81-d225-41e7-b1ec-fa6185f55097";
"4d529db0-b857-4910-a482-9e61bc87f959" -> "b5e03953-ad6c-4a39-981a-140d7fafc46a";
"d3bb79be-7b85-4c0f-b5b7-dbb6bd4c54a2" -> "e85378cb-074a-4d35-ae82-a9a99f9daf50";
"709c2602-24e3-4f9a-bb0d-8c089399c018" -> "dfe3dd59-018c-41ab-8046-21ed7d136b4a";
"dfe3dd59-018c-41ab-8046-21ed7d136b4a" -> "2d37edbb-d4d3-4b55-a593-8a2b1ab3ff25";
"dfe3dd59-018c-41ab-8046-21ed7d136b4a" -> "f075116f-c182-4b0d-9f7e-2dae61982c18";
}
================================================
FILE: classification_and_regression_trees/dot/ex2_prune.dot
================================================
digraph decision_tree {
"c4bff19d-b75d-4b50-99e8-34f696a77644" [label="0: 0.508542"];
"68b83894-3568-462c-a8c2-a2cfa600d44c" [label="0: 0.463241"];
"8fb7d681-5bb9-487b-804f-592a8760babb" [label="0: 0.130626"];
"ffb50925-bc5a-405b-aae5-15ba23da800d" [label="0: 0.085111"];
"92c528b2-54f3-488d-a817-b99003f2be3a" [label="0.77"];
"8f89c68f-ec97-4e89-9757-6d4d349dfb0b" [label="6.51"];
"ca28f207-af7e-43c9-8e5d-36a5e9cd3be5" [label="0: 0.377383"];
"14779c7f-cb2a-4452-b325-a27aef8b1af8" [label="0: 0.3417"];
"560ae6ed-a17a-46b8-a400-9e78920cf41b" [label="0: 0.32889"];
"288dea95-80b6-4638-8fb1-a1b657a19a73" [label="0: 0.300318"];
"9ac61729-5c19-4e91-915b-20beb665ceaa" [label="0: 0.176523"];
"42f1a782-408b-4c15-8bf0-13a01f968f81" [label="-9.18"];
"5f466b9c-fe6c-4465-97af-2520e6b66286" [label="0: 0.203993"];
"0a18504c-63a2-4af1-868c-38c591a02a60" [label="3.45"];
"74c75b17-a2a7-4bf0-b7d0-c4e4e916a29e" [label="0: 0.218321"];
"d1373766-52cd-4196-babe-eff0093695fe" [label="-11.82"];
"cfdc11bb-2ae8-4807-b3f1-c8e27c6f2317" [label="0: 0.228628"];
"480b7cfa-89e7-434d-9980-09192f674081" [label="6.77"];
"d56fe489-c3a4-4bcd-ace9-56c0e2510ec4" [label="0: 0.264639"];
"675438ce-b6e6-4489-8756-653ee5e37021" [label="-13.07"];
"2a8f2806-4eac-45ea-a820-eb776a0d8689" [label="0.40"];
"7993ad91-42aa-4b70-91db-5fbfed178b89" [label="-19.99"];
"f55749c7-c885-4d1a-b741-c3bd5c09f2dd" [label="15.06"];
"4f8fcefe-a83c-4aae-bff7-8ff98dec1643" [label="0: 0.351478"];
"e87b5ff0-822f-48c0-ae82-d32f92dd2111" [label="-22.69"];
"37a37975-c5da-4c85-9c02-d597c9041252" [label="-15.09"];
"0521b42e-1682-418f-a005-fe3d5098505a" [label="0: 0.446196"];
"c7762b6e-66e6-4a15-8458-f9ff2cc5cac6" [label="0: 0.418943"];
"266b231b-6083-474d-9ba8-8dbc1e4fec60" [label="1.38"];
"5981dfef-12f2-4ecf-83ef-fb111c6db412" [label="14.38"];
"fa23660c-72b3-49e7-a1f6-11038e6d0c2b" [label="-12.56"];
"c5ec6450-a9d7-4173-bca6-bb19e0eb3816" [label="0: 0.483803"];
"f31655f3-1003-4d88-8fcb-d57d75cf914e" [label="3.43"];
"f29cc8c5-ed1d-4b57-b16c-c657dd787edc" [label="12.51"];
"e74f8324-62e8-4d39-bd77-75ff0f2998f4" [label="0: 0.731636"];
"22f47068-79d7-4f15-8375-fca2b236b75f" [label="0: 0.642373"];
"d706789e-3e3b-4a19-8c5d-f97c17092013" [label="0: 0.618868"];
"72712f51-6089-4f84-bfac-5211971a9785" [label="0: 0.585413"];
"ac1505e9-2492-4269-a36d-bb93042e0af7" [label="0: 0.560301"];
"fc6c2bb4-7b7b-4d3f-b979-8ca27d29be5b" [label="0: 0.531944"];
"74d4917b-5afd-4078-bf91-22911bf0286f" [label="101.74"];
"eb5af9d7-1dc0-4e12-9b6f-d2de3d1dd6a7" [label="110.18"];
"d84a0c88-9b2c-4932-9f0f-c4c52d82de76" [label="97.20"];
"31a5cd3b-dbf9-4943-b90e-9a6d3a0a2a21" [label="123.21"];
"0790ed24-11c3-4310-a28c-fdd2d20ebed1" [label="93.67"];
"993e9dfe-e330-48c9-a4d5-44b6124403f0" [label="0: 0.667851"];
"8f434490-e990-4724-98b8-53c4b2cdbed5" [label="114.15"];
"4dc134f3-4882-45c0-bdac-eddc21bc10df" [label="0: 0.70889"];
"48da8c69-8998-4d8d-bd99-b268ed3281b9" [label="106.88"];
"393f82d6-a9a0-40e7-9c48-b3dc22233ef6" [label="114.55"];
"87ee99b7-59a7-49af-816d-ac7a0dc7d0cb" [label="0: 0.953902"];
"c5768cbc-7bae-45f3-80a0-ed03b930766b" [label="0: 0.763328"];
"633c832d-b589-4db2-a15e-0369eee79d08" [label="78.09"];
"e81e22e5-514c-42f7-859f-d21419008366" [label="0: 0.798198"];
"16ac6188-471c-439d-8503-da094a54ec82" [label="102.36"];
"0ed901ab-9dfb-4e60-b878-62b692e7dddc" [label="0: 0.838587"];
"e6c982d1-2c92-4787-9af6-d0f25432b035" [label="84.95"];
"7fa3f95d-e8a4-452d-bd78-dae6ad591cca" [label="0: 0.948822"];
"37d790b6-4448-4f11-b6c4-fa117cba9992" [label="0: 0.856421"];
"e068d29d-a65a-4c65-a5e6-3321f93076ba" [label="95.28"];
"7031a55c-261c-486b-ad1c-c94a66412706" [label="0: 0.912161"];
"5a91e296-ef36-4790-9e42-d191c8f5830a" [label="0: 0.896683"];
"c63246a4-4508-41f8-b7fe-412ac81e918d" [label="98.72"];
"0615432d-ca65-458e-971a-bfd6201edbaa" [label="104.83"];
"64df91ab-df9b-4c98-9e0d-4f8b2d2a4794" [label="96.45"];
"b1351d64-cf46-4654-9446-1bf4b89c0c32" [label="87.31"];
"e0dd7e12-6468-41b5-b63b-1c9a04ab682b" [label="108.84"];
"c4bff19d-b75d-4b50-99e8-34f696a77644" -> "68b83894-3568-462c-a8c2-a2cfa600d44c";
"68b83894-3568-462c-a8c2-a2cfa600d44c" -> "8fb7d681-5bb9-487b-804f-592a8760babb";
"8fb7d681-5bb9-487b-804f-592a8760babb" -> "ffb50925-bc5a-405b-aae5-15ba23da800d";
"ffb50925-bc5a-405b-aae5-15ba23da800d" -> "92c528b2-54f3-488d-a817-b99003f2be3a";
"ffb50925-bc5a-405b-aae5-15ba23da800d" -> "8f89c68f-ec97-4e89-9757-6d4d349dfb0b";
"8fb7d681-5bb9-487b-804f-592a8760babb" -> "ca28f207-af7e-43c9-8e5d-36a5e9cd3be5";
"ca28f207-af7e-43c9-8e5d-36a5e9cd3be5" -> "14779c7f-cb2a-4452-b325-a27aef8b1af8";
"14779c7f-cb2a-4452-b325-a27aef8b1af8" -> "560ae6ed-a17a-46b8-a400-9e78920cf41b";
"560ae6ed-a17a-46b8-a400-9e78920cf41b" -> "288dea95-80b6-4638-8fb1-a1b657a19a73";
"288dea95-80b6-4638-8fb1-a1b657a19a73" -> "9ac61729-5c19-4e91-915b-20beb665ceaa";
"9ac61729-5c19-4e91-915b-20beb665ceaa" -> "42f1a782-408b-4c15-8bf0-13a01f968f81";
"9ac61729-5c19-4e91-915b-20beb665ceaa" -> "5f466b9c-fe6c-4465-97af-2520e6b66286";
"5f466b9c-fe6c-4465-97af-2520e6b66286" -> "0a18504c-63a2-4af1-868c-38c591a02a60";
"5f466b9c-fe6c-4465-97af-2520e6b66286" -> "74c75b17-a2a7-4bf0-b7d0-c4e4e916a29e";
"74c75b17-a2a7-4bf0-b7d0-c4e4e916a29e" -> "d1373766-52cd-4196-babe-eff0093695fe";
"74c75b17-a2a7-4bf0-b7d0-c4e4e916a29e" -> "cfdc11bb-2ae8-4807-b3f1-c8e27c6f2317";
"cfdc11bb-2ae8-4807-b3f1-c8e27c6f2317" -> "480b7cfa-89e7-434d-9980-09192f674081";
"cfdc11bb-2ae8-4807-b3f1-c8e27c6f2317" -> "d56fe489-c3a4-4bcd-ace9-56c0e2510ec4";
"d56fe489-c3a4-4bcd-ace9-56c0e2510ec4" -> "675438ce-b6e6-4489-8756-653ee5e37021";
"d56fe489-c3a4-4bcd-ace9-56c0e2510ec4" -> "2a8f2806-4eac-45ea-a820-eb776a0d8689";
"288dea95-80b6-4638-8fb1-a1b657a19a73" -> "7993ad91-42aa-4b70-91db-5fbfed178b89";
"560ae6ed-a17a-46b8-a400-9e78920cf41b" -> "f55749c7-c885-4d1a-b741-c3bd5c09f2dd";
"14779c7f-cb2a-4452-b325-a27aef8b1af8" -> "4f8fcefe-a83c-4aae-bff7-8ff98dec1643";
"4f8fcefe-a83c-4aae-bff7-8ff98dec1643" -> "e87b5ff0-822f-48c0-ae82-d32f92dd2111";
"4f8fcefe-a83c-4aae-bff7-8ff98dec1643" -> "37a37975-c5da-4c85-9c02-d597c9041252";
"ca28f207-af7e-43c9-8e5d-36a5e9cd3be5" -> "0521b42e-1682-418f-a005-fe3d5098505a";
"0521b42e-1682-418f-a005-fe3d5098505a" -> "c7762b6e-66e6-4a15-8458-f9ff2cc5cac6";
"c7762b6e-66e6-4a15-8458-f9ff2cc5cac6" -> "266b231b-6083-474d-9ba8-8dbc1e4fec60";
"c7762b6e-66e6-4a15-8458-f9ff2cc5cac6" -> "5981dfef-12f2-4ecf-83ef-fb111c6db412";
"0521b42e-1682-418f-a005-fe3d5098505a" -> "fa23660c-72b3-49e7-a1f6-11038e6d0c2b";
"68b83894-3568-462c-a8c2-a2cfa600d44c" -> "c5ec6450-a9d7-4173-bca6-bb19e0eb3816";
"c5ec6450-a9d7-4173-bca6-bb19e0eb3816" -> "f31655f3-1003-4d88-8fcb-d57d75cf914e";
"c5ec6450-a9d7-4173-bca6-bb19e0eb3816" -> "f29cc8c5-ed1d-4b57-b16c-c657dd787edc";
"c4bff19d-b75d-4b50-99e8-34f696a77644" -> "e74f8324-62e8-4d39-bd77-75ff0f2998f4";
"e74f8324-62e8-4d39-bd77-75ff0f2998f4" -> "22f47068-79d7-4f15-8375-fca2b236b75f";
"22f47068-79d7-4f15-8375-fca2b236b75f" -> "d706789e-3e3b-4a19-8c5d-f97c17092013";
"d706789e-3e3b-4a19-8c5d-f97c17092013" -> "72712f51-6089-4f84-bfac-5211971a9785";
"72712f51-6089-4f84-bfac-5211971a9785" -> "ac1505e9-2492-4269-a36d-bb93042e0af7";
"ac1505e9-2492-4269-a36d-bb93042e0af7" -> "fc6c2bb4-7b7b-4d3f-b979-8ca27d29be5b";
"fc6c2bb4-7b7b-4d3f-b979-8ca27d29be5b" -> "74d4917b-5afd-4078-bf91-22911bf0286f";
"fc6c2bb4-7b7b-4d3f-b979-8ca27d29be5b" -> "eb5af9d7-1dc0-4e12-9b6f-d2de3d1dd6a7";
"ac1505e9-2492-4269-a36d-bb93042e0af7" -> "d84a0c88-9b2c-4932-9f0f-c4c52d82de76";
"72712f51-6089-4f84-bfac-5211971a9785" -> "31a5cd3b-dbf9-4943-b90e-9a6d3a0a2a21";
"d706789e-3e3b-4a19-8c5d-f97c17092013" -> "0790ed24-11c3-4310-a28c-fdd2d20ebed1";
"22f47068-79d7-4f15-8375-fca2b236b75f" -> "993e9dfe-e330-48c9-a4d5-44b6124403f0";
"993e9dfe-e330-48c9-a4d5-44b6124403f0" -> "8f434490-e990-4724-98b8-53c4b2cdbed5";
"993e9dfe-e330-48c9-a4d5-44b6124403f0" -> "4dc134f3-4882-45c0-bdac-eddc21bc10df";
"4dc134f3-4882-45c0-bdac-eddc21bc10df" -> "48da8c69-8998-4d8d-bd99-b268ed3281b9";
"4dc134f3-4882-45c0-bdac-eddc21bc10df" -> "393f82d6-a9a0-40e7-9c48-b3dc22233ef6";
"e74f8324-62e8-4d39-bd77-75ff0f2998f4" -> "87ee99b7-59a7-49af-816d-ac7a0dc7d0cb";
"87ee99b7-59a7-49af-816d-ac7a0dc7d0cb" -> "c5768cbc-7bae-45f3-80a0-ed03b930766b";
"c5768cbc-7bae-45f3-80a0-ed03b930766b" -> "633c832d-b589-4db2-a15e-0369eee79d08";
"c5768cbc-7bae-45f3-80a0-ed03b930766b" -> "e81e22e5-514c-42f7-859f-d21419008366";
"e81e22e5-514c-42f7-859f-d21419008366" -> "16ac6188-471c-439d-8503-da094a54ec82";
"e81e22e5-514c-42f7-859f-d21419008366" -> "0ed901ab-9dfb-4e60-b878-62b692e7dddc";
"0ed901ab-9dfb-4e60-b878-62b692e7dddc" -> "e6c982d1-2c92-4787-9af6-d0f25432b035";
"0ed901ab-9dfb-4e60-b878-62b692e7dddc" -> "7fa3f95d-e8a4-452d-bd78-dae6ad591cca";
"7fa3f95d-e8a4-452d-bd78-dae6ad591cca" -> "37d790b6-4448-4f11-b6c4-fa117cba9992";
"37d790b6-4448-4f11-b6c4-fa117cba9992" -> "e068d29d-a65a-4c65-a5e6-3321f93076ba";
"37d790b6-4448-4f11-b6c4-fa117cba9992" -> "7031a55c-261c-486b-ad1c-c94a66412706";
"7031a55c-261c-486b-ad1c-c94a66412706" -> "5a91e296-ef36-4790-9e42-d191c8f5830a";
"5a91e296-ef36-4790-9e42-d191c8f5830a" -> "c63246a4-4508-41f8-b7fe-412ac81e918d";
"5a91e296-ef36-4790-9e42-d191c8f5830a" -> "0615432d-ca65-458e-971a-bfd6201edbaa";
"7031a55c-261c-486b-ad1c-c94a66412706" -> "64df91ab-df9b-4c98-9e0d-4f8b2d2a4794";
"7fa3f95d-e8a4-452d-bd78-dae6ad591cca" -> "b1351d64-cf46-4654-9446-1bf4b89c0c32";
"87ee99b7-59a7-49af-816d-ac7a0dc7d0cb" -> "e0dd7e12-6468-41b5-b63b-1c9a04ab682b";
}
================================================
FILE: classification_and_regression_trees/dot/exp2.dot
================================================
digraph decision_tree {
"5c49cf77-b404-459e-b4fd-513a927807dc" [label="0: 0.304401"];
"83d1a5dd-ca47-4f50-845b-387b99fa210e" [label="[3.4687793552577886, 1.1852174309187824]"];
"dff8f3c5-1acc-4500-993a-7ab19e72d907" [label="[0.0016985569361161585, 11.964773944276974]"];
"5c49cf77-b404-459e-b4fd-513a927807dc" -> "83d1a5dd-ca47-4f50-845b-387b99fa210e";
"5c49cf77-b404-459e-b4fd-513a927807dc" -> "dff8f3c5-1acc-4500-993a-7ab19e72d907";
}
================================================
FILE: classification_and_regression_trees/ex0.txt
================================================
0.409175 1.883180
0.182603 0.063908
0.663687 3.042257
0.517395 2.305004
0.013643 -0.067698
0.469643 1.662809
0.725426 3.275749
0.394350 1.118077
0.507760 2.095059
0.237395 1.181912
0.057534 0.221663
0.369820 0.938453
0.976819 4.149409
0.616051 3.105444
0.413700 1.896278
0.105279 -0.121345
0.670273 3.161652
0.952758 4.135358
0.272316 0.859063
0.303697 1.170272
0.486698 1.687960
0.511810 1.979745
0.195865 0.068690
0.986769 4.052137
0.785623 3.156316
0.797583 2.950630
0.081306 0.068935
0.659753 2.854020
0.375270 0.999743
0.819136 4.048082
0.142432 0.230923
0.215112 0.816693
0.041270 0.130713
0.044136 -0.537706
0.131337 -0.339109
0.463444 2.124538
0.671905 2.708292
0.946559 4.017390
0.904176 4.004021
0.306674 1.022555
0.819006 3.657442
0.845472 4.073619
0.156258 0.011994
0.857185 3.640429
0.400158 1.808497
0.375395 1.431404
0.885807 3.935544
0.239960 1.162152
0.148640 -0.227330
0.143143 -0.068728
0.321582 0.825051
0.509393 2.008645
0.355891 0.664566
0.938633 4.180202
0.348057 0.864845
0.438898 1.851174
0.781419 2.761993
0.911333 4.075914
0.032469 0.110229
0.499985 2.181987
0.771663 3.152528
0.670361 3.046564
0.176202 0.128954
0.392170 1.062726
0.911188 3.651742
0.872288 4.401950
0.733107 3.022888
0.610239 2.874917
0.732739 2.946801
0.714825 2.893644
0.076386 0.072131
0.559009 1.748275
0.427258 1.912047
0.841875 3.710686
0.558918 1.719148
0.533241 2.174090
0.956665 3.656357
0.620393 3.522504
0.566120 2.234126
0.523258 1.859772
0.476884 2.097017
0.176408 0.001794
0.303094 1.231928
0.609731 2.953862
0.017774 -0.116803
0.622616 2.638864
0.886539 3.943428
0.148654 -0.328513
0.104350 -0.099866
0.116868 -0.030836
0.516514 2.359786
0.664896 3.212581
0.004327 0.188975
0.425559 1.904109
0.743671 3.007114
0.935185 3.845834
0.697300 3.079411
0.444551 1.939739
0.683753 2.880078
0.755993 3.063577
0.902690 4.116296
0.094491 -0.240963
0.873831 4.066299
0.991810 4.011834
0.185611 0.077710
0.694551 3.103069
0.657275 2.811897
0.118746 -0.104630
0.084302 0.025216
0.945341 4.330063
0.785827 3.087091
0.530933 2.269988
0.879594 4.010701
0.652770 3.119542
0.879338 3.723411
0.764739 2.792078
0.504884 2.192787
0.554203 2.081305
0.493209 1.714463
0.363783 0.885854
0.316465 1.028187
0.580283 1.951497
0.542898 1.709427
0.112661 0.144068
0.816742 3.880240
0.234175 0.921876
0.402804 1.979316
0.709423 3.085768
0.867298 3.476122
0.993392 3.993679
0.711580 3.077880
0.133643 -0.105365
0.052031 -0.164703
0.366806 1.096814
0.697521 3.092879
0.787262 2.987926
0.476710 2.061264
0.721417 2.746854
0.230376 0.716710
0.104397 0.103831
0.197834 0.023776
0.129291 -0.033299
0.528528 1.942286
0.009493 -0.006338
0.998533 3.808753
0.363522 0.652799
0.901386 4.053747
0.832693 4.569290
0.119002 -0.032773
0.487638 2.066236
0.153667 0.222785
0.238619 1.089268
0.208197 1.487788
0.750921 2.852033
0.183403 0.024486
0.995608 3.737750
0.151311 0.045017
0.126804 0.001238
0.983153 3.892763
0.772495 2.819376
0.784133 2.830665
0.056934 0.234633
0.425584 1.810782
0.998709 4.237235
0.707815 3.034768
0.413816 1.742106
0.217152 1.169250
0.360503 0.831165
0.977989 3.729376
0.507953 1.823205
0.920771 4.021970
0.210542 1.262939
0.928611 4.159518
0.580373 2.039114
0.841390 4.101837
0.681530 2.778672
0.292795 1.228284
0.456918 1.736620
0.134128 -0.195046
0.016241 -0.063215
0.691214 3.305268
0.582002 2.063627
0.303102 0.898840
0.622598 2.701692
0.525024 1.992909
0.996775 3.811393
0.881025 4.353857
0.723457 2.635641
0.676346 2.856311
0.254625 1.352682
0.488632 2.336459
0.519875 2.111651
0.160176 0.121726
0.609483 3.264605
0.531881 2.103446
0.321632 0.896855
0.845148 4.220850
0.012003 -0.217283
0.018883 -0.300577
0.071476 0.006014
================================================
FILE: classification_and_regression_trees/ex00.txt
================================================
0.036098 0.155096
0.993349 1.077553
0.530897 0.893462
0.712386 0.564858
0.343554 -0.371700
0.098016 -0.332760
0.691115 0.834391
0.091358 0.099935
0.727098 1.000567
0.951949 0.945255
0.768596 0.760219
0.541314 0.893748
0.146366 0.034283
0.673195 0.915077
0.183510 0.184843
0.339563 0.206783
0.517921 1.493586
0.703755 1.101678
0.008307 0.069976
0.243909 -0.029467
0.306964 -0.177321
0.036492 0.408155
0.295511 0.002882
0.837522 1.229373
0.202054 -0.087744
0.919384 1.029889
0.377201 -0.243550
0.814825 1.095206
0.611270 0.982036
0.072243 -0.420983
0.410230 0.331722
0.869077 1.114825
0.620599 1.334421
0.101149 0.068834
0.820802 1.325907
0.520044 0.961983
0.488130 -0.097791
0.819823 0.835264
0.975022 0.673579
0.953112 1.064690
0.475976 -0.163707
0.273147 -0.455219
0.804586 0.924033
0.074795 -0.349692
0.625336 0.623696
0.656218 0.958506
0.834078 1.010580
0.781930 1.074488
0.009849 0.056594
0.302217 -0.148650
0.678287 0.907727
0.180506 0.103676
0.193641 -0.327589
0.343479 0.175264
0.145809 0.136979
0.996757 1.035533
0.590210 1.336661
0.238070 -0.358459
0.561362 1.070529
0.377597 0.088505
0.099142 0.025280
0.539558 1.053846
0.790240 0.533214
0.242204 0.209359
0.152324 0.132858
0.252649 -0.055613
0.895930 1.077275
0.133300 -0.223143
0.559763 1.253151
0.643665 1.024241
0.877241 0.797005
0.613765 1.621091
0.645762 1.026886
0.651376 1.315384
0.697718 1.212434
0.742527 1.087056
0.901056 1.055900
0.362314 -0.556464
0.948268 0.631862
0.000234 0.060903
0.750078 0.906291
0.325412 -0.219245
0.726828 1.017112
0.348013 0.048939
0.458121 -0.061456
0.280738 -0.228880
0.567704 0.969058
0.750918 0.748104
0.575805 0.899090
0.507940 1.107265
0.071769 -0.110946
0.553520 1.391273
0.401152 -0.121640
0.406649 -0.366317
0.652121 1.004346
0.347837 -0.153405
0.081931 -0.269756
0.821648 1.280895
0.048014 0.064496
0.130962 0.184241
0.773422 1.125943
0.789625 0.552614
0.096994 0.227167
0.625791 1.244731
0.589575 1.185812
0.323181 0.180811
0.822443 1.086648
0.360323 -0.204830
0.950153 1.022906
0.527505 0.879560
0.860049 0.717490
0.007044 0.094150
0.438367 0.034014
0.574573 1.066130
0.536689 0.867284
0.782167 0.886049
0.989888 0.744207
0.761474 1.058262
0.985425 1.227946
0.132543 -0.329372
0.346986 -0.150389
0.768784 0.899705
0.848921 1.170959
0.449280 0.069098
0.066172 0.052439
0.813719 0.706601
0.661923 0.767040
0.529491 1.022206
0.846455 0.720030
0.448656 0.026974
0.795072 0.965721
0.118156 -0.077409
0.084248 -0.019547
0.845815 0.952617
0.576946 1.234129
0.772083 1.299018
0.696648 0.845423
0.595012 1.213435
0.648675 1.287407
0.897094 1.240209
0.552990 1.036158
0.332982 0.210084
0.065615 -0.306970
0.278661 0.253628
0.773168 1.140917
0.203693 -0.064036
0.355688 -0.119399
0.988852 1.069062
0.518735 1.037179
0.514563 1.156648
0.976414 0.862911
0.919074 1.123413
0.697777 0.827805
0.928097 0.883225
0.900272 0.996871
0.344102 -0.061539
0.148049 0.204298
0.130052 -0.026167
0.302001 0.317135
0.337100 0.026332
0.314924 -0.001952
0.269681 -0.165971
0.196005 -0.048847
0.129061 0.305107
0.936783 1.026258
0.305540 -0.115991
0.683921 1.414382
0.622398 0.766330
0.902532 0.861601
0.712503 0.933490
0.590062 0.705531
0.723120 1.307248
0.188218 0.113685
0.643601 0.782552
0.520207 1.209557
0.233115 -0.348147
0.465625 -0.152940
0.884512 1.117833
0.663200 0.701634
0.268857 0.073447
0.729234 0.931956
0.429664 -0.188659
0.737189 1.200781
0.378595 -0.296094
0.930173 1.035645
0.774301 0.836763
0.273940 -0.085713
0.824442 1.082153
0.626011 0.840544
0.679390 1.307217
0.578252 0.921885
0.785541 1.165296
0.597409 0.974770
0.014083 -0.132525
0.663870 1.187129
0.552381 1.369630
0.683886 0.999985
0.210334 -0.006899
0.604529 1.212685
0.250744 0.046297
================================================
FILE: classification_and_regression_trees/ex2.dot
================================================
digraph decision_tree {
"e1b05249-eb8e-4afd-837c-d2f5a5299a6a" [label="0: 0.508542"];
"b82d5e44-41de-40ec-8558-fad039b53058" [label="-2.64"];
"0b668e3e-42eb-4735-a6ba-420826ffc809" [label="0: 0.731636"];
"e1a950cd-cd46-4ce1-941e-c59c56377d2e" [label="107.69"];
"b2ee8f32-0401-4b83-a2ee-3f9212b6d8a1" [label="96.32"];
"e1b05249-eb8e-4afd-837c-d2f5a5299a6a" -> "b82d5e44-41de-40ec-8558-fad039b53058";
"e1b05249-eb8e-4afd-837c-d2f5a5299a6a" -> "0b668e3e-42eb-4735-a6ba-420826ffc809";
"0b668e3e-42eb-4735-a6ba-420826ffc809" -> "e1a950cd-cd46-4ce1-941e-c59c56377d2e";
"0b668e3e-42eb-4735-a6ba-420826ffc809" -> "b2ee8f32-0401-4b83-a2ee-3f9212b6d8a1";
}
================================================
FILE: classification_and_regression_trees/ex2.txt
================================================
0.228628 -2.266273
0.965969 112.386764
0.342761 -31.584855
0.901444 87.300625
0.585413 125.295113
0.334900 18.976650
0.769043 64.041941
0.297107 -1.798377
0.901421 100.133819
0.176523 0.946348
0.710234 108.553919
0.981980 86.399637
0.085873 -10.137104
0.537834 90.995536
0.806158 62.877698
0.708890 135.416767
0.787755 118.642009
0.463241 17.171057
0.300318 -18.051318
0.815215 118.319942
0.139880 7.336784
0.068373 -15.160836
0.457563 -34.044555
0.665652 105.547997
0.084661 -24.132226
0.954711 100.935789
0.953902 130.926480
0.487381 27.729263
0.759504 81.106762
0.454312 -20.360067
0.295993 -14.988279
0.156067 7.557349
0.428582 15.224266
0.847219 76.240984
0.499171 11.924204
0.203993 -22.379119
0.548539 83.114502
0.790312 110.159730
0.937766 119.949824
0.218321 1.410768
0.223200 15.501642
0.896683 107.001620
0.582311 82.589328
0.698920 92.470636
0.823848 59.342323
0.385021 24.816941
0.061219 6.695567
0.841547 115.669032
0.763328 115.199195
0.934853 115.753994
0.222271 -9.255852
0.217214 -3.958752
0.706961 106.180427
0.888426 94.896354
0.549814 137.267576
0.107960 -1.293195
0.085111 37.820659
0.388789 21.578007
0.467383 -9.712925
0.623909 87.181863
0.373501 -8.228297
0.513332 101.075609
0.350725 -40.086564
0.716211 103.345308
0.731636 73.912028
0.273863 -9.457556
0.211633 -8.332207
0.944221 100.120253
0.053764 -13.731698
0.126833 22.891675
0.952833 100.649591
0.391609 3.001104
0.560301 82.903945
0.124723 -1.402796
0.465680 -23.777531
0.699873 115.586605
0.164134 -27.405211
0.455761 9.841938
0.508542 96.403373
0.138619 -29.087463
0.335182 2.768225
0.908629 118.513475
0.546601 96.319043
0.378965 13.583555
0.968621 98.648346
0.637999 91.656617
0.350065 -1.319852
0.632691 93.645293
0.936524 65.548418
0.310956 -49.939516
0.437652 19.745224
0.166765 -14.740059
0.571214 114.872056
0.952377 73.520802
0.665329 121.980607
0.258070 -20.425137
0.912161 85.005351
0.777582 100.838446
0.642707 82.500766
0.885676 108.045948
0.080061 2.229873
0.039914 11.220099
0.958512 135.837013
0.377383 5.241196
0.661073 115.687524
0.454375 3.043912
0.412516 -26.419289
0.854970 89.209930
0.698472 120.521925
0.465561 30.051931
0.328890 39.783113
0.309133 8.814725
0.418943 44.161493
0.553797 120.857321
0.799873 91.368473
0.811363 112.981216
0.785574 107.024467
0.949198 105.752508
0.666452 120.014736
0.652462 112.715799
0.290749 -14.391613
0.508548 93.292829
0.680486 110.367074
0.356790 -19.526539
0.199903 -3.372472
0.264926 5.280579
0.166431 -6.512506
0.370042 -32.124495
0.628061 117.628346
0.228473 19.425158
0.044737 3.855393
0.193282 18.208423
0.519150 116.176162
0.351478 -0.461116
0.872199 111.552716
0.115150 13.795828
0.324274 -13.189243
0.446196 -5.108172
0.613004 168.180746
0.533511 129.766743
0.740859 93.773929
0.667851 92.449664
0.900699 109.188248
0.599142 130.378529
0.232802 1.222318
0.838587 134.089674
0.284794 35.623746
0.130626 -39.524461
0.642373 140.613941
0.786865 100.598825
0.403228 -1.729244
0.883615 95.348184
0.910975 106.814667
0.819722 70.054508
0.798198 76.853728
0.606417 93.521396
0.108801 -16.106164
0.318309 -27.605424
0.856421 107.166848
0.842940 95.893131
0.618868 76.917665
0.531944 124.795495
0.028546 -8.377094
0.915263 96.717610
0.925782 92.074619
0.624827 105.970743
0.331364 -1.290825
0.341700 -23.547711
0.342155 -16.930416
0.729397 110.902830
0.640515 82.713621
0.228751 -30.812912
0.948822 69.318649
0.706390 105.062147
0.079632 29.420068
0.451087 -28.724685
0.833026 76.723835
0.589806 98.674874
0.426711 -21.594268
0.872883 95.887712
0.866451 94.402102
0.960398 123.559747
0.483803 5.224234
0.811602 99.841379
0.757527 63.549854
0.569327 108.435392
0.841625 60.552308
0.264639 2.557923
0.202161 -1.983889
0.055862 -3.131497
0.543843 98.362010
0.689099 112.378209
0.956951 82.016541
0.382037 -29.007783
0.131833 22.478291
0.156273 0.225886
0.000256 9.668106
0.892999 82.436686
0.206207 -12.619036
0.487537 5.149336
================================================
FILE: classification_and_regression_trees/ex2test.txt
================================================
0.421862 10.830241
0.105349 -2.241611
0.155196 21.872976
0.161152 2.015418
0.382632 -38.778979
0.017710 20.109113
0.129656 15.266887
0.613926 111.900063
0.409277 1.874731
0.807556 111.223754
0.593722 133.835486
0.953239 110.465070
0.257402 15.332899
0.645385 93.983054
0.563460 93.645277
0.408338 -30.719878
0.874394 91.873505
0.263805 -0.192752
0.411198 10.751118
0.449884 9.211901
0.646315 113.533660
0.673718 125.135638
0.805148 113.300462
0.759327 72.668572
0.519172 82.131698
0.741031 106.777146
0.030937 9.859127
0.268848 -34.137955
0.474901 -11.201301
0.588266 120.501998
0.893936 142.826476
0.870990 105.751746
0.430763 39.146258
0.057665 15.371897
0.100076 9.131761
0.980716 116.145896
0.235289 -13.691224
0.228098 16.089151
0.622248 99.345551
0.401467 -1.694383
0.960334 110.795415
0.031214 -5.330042
0.504228 96.003525
0.779660 75.921582
0.504496 101.341462
0.850974 96.293064
0.701119 102.333839
0.191551 5.072326
0.667116 92.310019
0.555584 80.367129
0.680006 132.965442
0.393899 38.605283
0.048940 -9.861871
0.963282 115.407485
0.655496 104.269918
0.576463 141.127267
0.675708 96.227996
0.853457 114.252288
0.003933 -12.182861
0.549512 97.927224
0.218967 -4.712462
0.659972 120.950439
0.008256 8.026816
0.099500 -14.318434
0.352215 -3.747546
0.874926 89.247356
0.635084 99.496059
0.039641 14.147109
0.665111 103.298719
0.156583 -2.540703
0.648843 119.333019
0.893237 95.209585
0.128807 5.558479
0.137438 5.567685
0.630538 98.462792
0.296084 -41.799438
0.632099 84.895098
0.987681 106.726447
0.744909 111.279705
0.862030 104.581156
0.080649 -7.679985
0.831277 59.053356
0.198716 26.878801
0.860932 90.632930
0.883250 92.759595
0.818003 110.272219
0.949216 115.200237
0.460078 -35.957981
0.561077 93.545761
0.863767 114.125786
0.476891 -29.774060
0.537826 81.587922
0.686224 110.911198
0.982327 119.114523
0.944453 92.033481
0.078227 30.216873
0.782937 92.588646
0.465886 2.222139
0.885024 90.247890
0.186077 7.144415
0.915828 84.010074
0.796649 115.572156
0.127821 28.933688
0.433429 6.782575
0.946796 108.574116
0.386915 -17.404601
0.561192 92.142700
0.182490 10.764616
0.878792 95.289476
0.381342 -6.177464
0.358474 -11.731754
0.270647 13.793201
0.488904 -17.641832
0.106773 5.684757
0.270112 4.335675
0.754985 75.860433
0.585174 111.640154
0.458821 12.029692
0.218017 -26.234872
0.583887 99.413850
0.923626 107.802298
0.833620 104.179678
0.870691 93.132591
0.249896 -8.618404
0.748230 109.160652
0.019365 34.048884
0.837588 101.239275
0.529251 115.514729
0.742898 67.038771
0.522034 64.160799
0.498982 3.983061
0.479439 24.355908
0.314834 -14.256200
0.753251 85.017092
0.479362 -17.480446
0.950593 99.072784
0.718623 58.080256
0.218720 -19.605593
0.664113 94.437159
0.942900 131.725134
0.314226 18.904871
0.284509 11.779346
0.004962 -14.624176
0.224087 -50.547649
0.974331 112.822725
0.894610 112.863995
0.167350 0.073380
0.753644 105.024456
0.632241 108.625812
0.314189 -6.090797
0.965527 87.418343
0.820919 94.610538
0.144107 -4.748387
0.072556 -5.682008
0.002447 29.685714
0.851007 79.632376
0.458024 -12.326026
0.627503 139.458881
0.422259 -29.827405
0.714659 63.480271
0.672320 93.608554
0.498592 37.112975
0.698906 96.282845
0.861441 99.699230
0.112425 -12.419909
0.164784 5.244704
0.481531 -18.070497
0.375482 1.779411
0.089325 -14.216755
0.036609 -6.264372
0.945004 54.723563
0.136608 14.970936
0.292285 -41.723711
0.029195 -0.660279
0.998307 100.124230
0.303928 -5.492264
0.957863 117.824392
0.815089 113.377704
0.466399 -10.249874
0.876693 115.617275
0.536121 102.997087
0.373984 -37.359936
0.565162 74.967476
0.085412 -21.449563
0.686411 64.859620
0.908752 107.983366
0.982829 98.005424
0.052766 -42.139502
0.777552 91.899340
0.374316 -3.522501
0.060231 10.008227
0.526225 87.317722
0.583872 67.104433
0.238276 10.615159
0.678747 60.624273
0.067649 15.947398
0.530182 105.030933
0.869389 104.969996
0.698410 75.460417
0.549430 82.558068
================================================
FILE: classification_and_regression_trees/exp.txt
================================================
0.529582 100.737303
0.985730 103.106872
0.797869 99.666151
0.393473 -1.773056
0.272568 -1.170222
0.758825 96.752440
0.218359 2.337347
0.926357 98.343231
0.726881 99.633009
0.805311 102.253834
0.208632 0.493174
0.184921 -2.231071
0.660135 100.139355
0.871875 96.637420
0.657182 100.345442
0.942481 97.751546
0.427843 -1.380170
0.845958 98.195303
0.878696 99.380485
0.582034 100.971036
0.118114 2.397033
0.144718 1.304535
0.576046 101.624714
0.750305 97.601324
0.518281 100.093634
0.260793 -1.361888
0.390245 -2.973759
0.963020 98.877859
0.880661 97.631997
0.291780 -1.638124
0.192903 -2.221257
0.461442 -1.074725
0.821171 99.372052
0.144557 2.589464
0.379346 0.991090
0.383822 1.832389
0.055406 -1.870700
0.084308 -0.611701
0.719578 100.087948
0.417471 -0.510292
0.477894 -3.426525
0.871228 100.307522
0.113074 -1.011079
0.409434 -0.616173
0.967141 96.551856
0.938254 97.052196
0.079989 2.083496
0.150207 1.285491
0.417339 -0.462985
0.038787 -2.237234
0.954657 102.111432
0.844894 98.350138
0.106770 -0.998182
0.247831 2.483594
0.108687 -0.920229
0.758165 98.079399
0.199978 -3.490410
0.600602 99.850119
0.026466 1.342825
0.141239 -0.949858
0.181437 -2.223725
0.352656 2.251362
0.803371 99.647157
0.677303 100.414859
0.561674 99.133372
0.497533 -3.764935
0.523327 98.452850
0.507075 103.807755
0.791978 99.414598
0.956890 95.977239
0.487927 1.199149
0.788795 100.012047
0.554283 98.522458
0.814361 97.642150
0.788940 97.399942
0.515845 102.240479
0.758538 97.461917
0.041824 -3.294141
0.341352 1.246559
0.194801 -2.285278
0.805528 99.023113
0.435762 0.361749
0.941615 100.746547
0.478234 0.791146
0.057445 -4.266792
0.510079 98.845273
0.209900 -0.861890
0.902668 101.429190
0.456602 -2.856392
0.997595 99.828241
0.048240 -0.268920
0.319531 0.896696
0.264929 -1.000487
0.432727 -4.630489
0.419828 1.260534
0.667056 99.456518
0.488173 1.574322
0.746300 100.563503
0.528660 100.736739
0.624185 99.562872
0.169411 1.809929
0.011025 4.132846
0.974164 98.706049
0.267957 0.297803
0.726093 99.381040
0.465163 -2.344545
0.993698 101.507792
0.816513 99.903496
0.398756 0.378060
0.054974 -0.588770
0.857067 100.322945
0.362328 2.551786
0.316961 -0.528283
0.167881 -0.376517
0.393776 3.658204
0.739991 100.426554
0.457949 0.857428
0.060635 2.484776
0.942634 101.254420
0.553691 102.467820
0.394694 -0.248353
0.714625 99.650556
0.273503 1.111820
0.471886 -5.665559
0.746476 98.720163
0.140209 0.471820
0.024197 -2.854251
0.521287 99.703915
0.672280 100.463227
0.380342 -0.785713
0.956380 99.482209
0.455254 1.613841
0.647551 101.591193
0.682498 98.267734
0.054839 -2.286019
0.716849 100.614510
0.217732 -2.161633
0.918885 100.260067
0.576026 101.719788
0.868511 100.669152
0.661135 97.637969
0.166334 1.374014
0.106850 -3.658050
0.768242 104.193841
0.240916 -0.368100
0.124957 2.821672
0.984335 98.571444
0.908524 101.777344
0.861217 98.656403
0.944295 100.154508
0.527278 101.052710
0.717072 100.788373
0.130227 0.115694
0.494734 -1.220681
0.498733 0.961514
0.519411 101.331622
0.712409 104.891067
0.933858 98.180299
0.266051 0.398961
0.153690 -0.657128
0.209181 1.486816
0.942699 102.187578
0.766799 100.213348
0.862578 101.816969
0.223266 2.854445
0.611394 103.428497
0.996212 98.494158
0.724945 99.098450
0.399346 0.879259
0.750510 98.729864
0.446060 0.639843
0.999913 101.502887
0.111561 3.256383
0.094755 0.170475
0.366547 0.488994
0.179924 -0.871567
0.969023 99.982789
0.941420 100.416754
0.656851 98.520940
0.983166 99.546591
0.167843 0.033922
0.316245 2.171137
0.817118 102.849575
0.173642 1.209173
0.411030 2.022640
0.265041 2.216470
0.779660 98.475428
0.059354 -0.929568
0.722092 97.974003
0.511958 101.924447
0.371938 -0.640602
0.851009 97.873330
0.375918 -5.308115
0.797332 99.763778
0.107749 -3.770092
0.156937 -0.876724
0.960447 99.597097
0.413434 2.408090
0.644257 100.453125
0.119332 -0.495588
================================================
FILE: classification_and_regression_trees/exp2.dot
================================================
digraph decision_tree {
"c830d5ff-5d25-4637-a268-2bb63f5d4351" [label="0: 0.304401"];
"44889deb-3d44-405b-a7cf-d8dfa5604cb9" [label="[3.4687793552577886, 1.1852174309187824]"];
"4a419f47-2097-4b6e-b01e-047203bf4370" [label="[0.0016985569361161585, 11.964773944276974]"];
"c830d5ff-5d25-4637-a268-2bb63f5d4351" -> "44889deb-3d44-405b-a7cf-d8dfa5604cb9";
"c830d5ff-5d25-4637-a268-2bb63f5d4351" -> "4a419f47-2097-4b6e-b01e-047203bf4370";
}
================================================
FILE: classification_and_regression_trees/exp2.txt
================================================
0.070670 3.470829
0.534076 6.377132
0.747221 8.949407
0.668970 8.034081
0.586082 6.997721
0.764962 9.318110
0.658125 7.880333
0.346734 4.213359
0.313967 3.762496
0.601418 7.188805
0.404396 4.893403
0.154345 3.683175
0.984061 11.712928
0.597514 7.146694
0.005144 3.333150
0.142295 3.743681
0.280007 3.737376
0.542008 6.494275
0.466781 5.532255
0.706970 8.476718
0.191038 3.673921
0.756591 9.176722
0.912879 10.850358
0.524701 6.067444
0.306090 3.681148
0.429009 5.032168
0.695091 8.209058
0.984495 11.909595
0.702748 8.298454
0.551771 6.715210
0.272894 3.983313
0.014611 3.559081
0.699852 8.417306
0.309710 3.739053
0.444877 5.219649
0.717509 8.483072
0.576550 6.894860
0.284200 3.792626
0.675922 8.067282
0.304401 3.671373
0.233675 3.795962
0.453779 5.477533
0.900938 10.701447
0.502418 6.046703
0.781843 9.254690
0.226271 3.546938
0.619535 7.703312
0.519998 6.202835
0.399447 4.934647
0.785298 9.497564
0.010767 3.565835
0.696399 8.307487
0.524366 6.266060
0.396583 4.611390
0.059988 3.484805
0.946702 11.263118
0.417559 4.895128
0.609194 7.239316
0.730687 8.858371
0.586694 7.061601
0.829567 9.937968
0.964229 11.521595
0.276813 3.756406
0.987041 11.947913
0.876107 10.440538
0.747582 8.942278
0.117348 3.567821
0.188617 3.976420
0.416655 4.928907
0.192995 3.978365
0.244888 3.777018
0.806349 9.685831
0.417555 4.990148
0.233805 3.740022
0.357325 4.325355
0.190201 3.638493
0.705127 8.432886
0.336599 3.868493
0.473786 5.871813
0.384794 4.830712
0.502217 6.117244
0.788220 9.454959
0.478773 5.681631
0.064296 3.642040
0.332143 3.886628
0.618869 7.312725
0.854981 10.306697
0.570000 6.764615
0.512739 6.166836
0.112285 3.545863
0.723700 8.526944
0.192256 3.661033
0.181268 3.678579
0.196731 3.916622
0.510342 6.026652
0.263713 3.723018
0.141105 3.529595
0.150262 3.552314
0.824724 9.973690
0.588088 6.893128
0.411291 4.856380
0.763717 9.199101
0.212118 3.740024
0.264587 3.742917
0.973524 11.683243
0.250670 3.679117
0.823460 9.743861
0.253752 3.781488
0.838332 10.172180
0.501156 6.113263
0.097275 3.472367
0.667199 7.948868
0.487320 6.022060
0.654640 7.809457
0.906907 10.775188
0.821941 9.936140
0.859396 10.428255
0.078696 3.490510
0.938092 11.252471
0.998868 11.863062
0.025501 3.515624
0.451806 5.441171
0.883872 10.498912
0.583567 6.912334
0.823688 10.003723
0.891032 10.818109
0.879259 10.639263
0.163007 3.662715
0.344263 4.169705
0.796083 9.422591
0.903683 10.978834
0.050129 3.575105
0.605553 7.306014
0.628951 7.556742
0.877052 10.444055
0.829402 9.856432
0.121422 3.638276
0.721517 8.663569
0.066532 3.673471
0.996587 11.782002
0.653384 7.804568
0.739494 8.817809
0.640341 7.636812
0.337828 3.971613
0.220512 3.713645
0.368815 4.381696
0.782509 9.349428
0.645825 7.790882
0.277391 3.834258
0.092569 3.643274
0.284320 3.609353
0.344465 4.023259
0.182523 3.749195
0.385001 4.426970
0.747609 8.966676
0.188907 3.711018
0.806244 9.610438
0.014211 3.517818
0.574813 7.040672
0.714500 8.525624
0.538982 6.393940
0.384638 4.649362
0.915586 10.936577
0.883513 10.441493
0.804148 9.742851
0.466011 5.833439
0.800574 9.638874
0.654980 8.028558
0.348564 4.064616
0.978595 11.720218
0.915906 10.833902
0.285477 3.818961
0.988631 11.684010
0.531069 6.305005
0.181658 3.806995
0.039657 3.356861
0.893344 10.776799
0.355214 4.263666
0.783508 9.475445
0.039768 3.429691
0.546308 6.472749
0.786882 9.398951
0.168282 3.564189
0.374900 4.399040
0.737767 8.888536
0.059849 3.431537
0.861891 10.246888
0.597578 7.112627
0.126050 3.611641
0.074795 3.609222
0.634401 7.627416
0.831633 9.926548
0.019095 3.470285
0.396533 4.773104
0.794973 9.492009
0.889088 10.420003
0.003174 3.587139
0.176767 3.554071
0.943730 11.227731
0.758564 8.885337
================================================
FILE: classification_and_regression_trees/model_tree.py
================================================
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import uuid
from collections import namedtuple

import numpy as np
import matplotlib.pyplot as plt

from regression_tree import *
def linear_regression(dataset):
    ''' Compute ordinary least-squares regression coefficients for the dataset.
    '''
dataset = np.matrix(dataset)
    # Split features/target and prepend a constant (bias) column
X_ori, y = dataset[:, :-1], dataset[:, -1]
X_ori, y = np.matrix(X_ori), np.matrix(y)
m, n = X_ori.shape
X = np.matrix(np.ones((m, n+1)))
X[:, 1:] = X_ori
    # Regression coefficients via the normal equation: w = (X.T*X)^-1 * X.T*y
w = (X.T*X).I*X.T*y
return w, X, y
def fleaf(dataset):
    ''' Leaf function for the model tree: return the linear regression
    coefficients fitted on the given dataset.
    '''
w, _, _ = linear_regression(dataset)
return w
def ferr(dataset):
    ''' Fit a linear model to the dataset and return the variance of the
    residuals as the split error.
    '''
w, X, y = linear_regression(dataset)
y_prime = X*w
return np.var(y_prime - y)
def get_nodes_edges(tree, root_node=None):
    ''' Return all nodes and edges of the tree (for Graphviz rendering).
    '''
Node = namedtuple('Node', ['id', 'label'])
Edge = namedtuple('Edge', ['start', 'end'])
nodes, edges = [], []
if type(tree) is not dict:
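        # a leaf stores regression coefficients rather than a sub-tree dict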
return nodes, edges
if root_node is None:
label = '{}: {}'.format(tree['feat_idx'], tree['feat_val'])
root_node = Node._make([uuid.uuid4(), label])
nodes.append(root_node)
for sub_tree in (tree['left'], tree['right']):
if type(sub_tree) is dict:
node_label = '{}: {}'.format(sub_tree['feat_idx'], sub_tree['feat_val'])
else:
node_label = '{}'.format(np.array(sub_tree.T).tolist()[0])
sub_node = Node._make([uuid.uuid4(), node_label])
nodes.append(sub_node)
edge = Edge._make([root_node, sub_node])
edges.append(edge)
sub_nodes, sub_edges = get_nodes_edges(sub_tree, root_node=sub_node)
nodes.extend(sub_nodes)
edges.extend(sub_edges)
return nodes, edges
def dotify(tree):
    ''' Build the content of a Graphviz dot file for the tree.
    '''
content = 'digraph decision_tree {\n'
nodes, edges = get_nodes_edges(tree)
for node in nodes:
content += ' "{}" [label="{}"];\n'.format(node.id, node.label)
for edge in edges:
start, end = edge.start, edge.end
content += ' "{}" -> "{}";\n'.format(start.id, end.id)
content += '}'
return content
def tree_predict(data, tree):
if type(tree) is not dict:
w = tree
y = np.matrix(data)*w
return y[0, 0]
feat_idx, feat_val = tree['feat_idx'], tree['feat_val']
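    # `data` carries a leading 1.0 bias term (see __main__), so the split
    # feature index from the raw dataset is offset by one when indexing it.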
if data[feat_idx+1] < feat_val:
return tree_predict(data, tree['left'])
else:
return tree_predict(data, tree['right'])
if '__main__' == __name__:
dataset = load_data('exp2.txt')
tree = create_tree(dataset, fleaf, ferr, opt={'err_tolerance': 0.1, 'n_tolerance': 4})
    # Write the model tree to a Graphviz dot file
with open('exp2.dot', 'w') as f:
f.write(dotify(tree))
dataset = np.array(dataset)
    # Scatter plot of the raw data
plt.scatter(dataset[:, 0], dataset[:, 1])
    # Plot the fitted piecewise-linear regression curve
x = np.sort(dataset[:, 0])
y = [tree_predict([1.0] + [i], tree) for i in x]
plt.plot(x, y, c='r')
plt.show()
================================================
FILE: classification_and_regression_trees/notebook/分段函数回归树.ipynb
================================================
{
"cells": [
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"%matplotlib inline"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from regression_tree import *"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"dataset = load_data('ex0.txt')\n",
"tree = create_tree(dataset, fleaf, ferr)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'feat_idx': 0,\n",
" 'feat_val': 0.40015800000000001,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.20819699999999999,\n",
" 'left': -0.023838155555555553,\n",
" 'right': 1.0289583666666666},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.609483,\n",
" 'left': 1.980035071428571,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.81674199999999997,\n",
" 'left': 2.9836209534883724,\n",
" 'right': 3.9871631999999999}}}"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tree"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"dataset = np.array(dataset)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.collections.PathCollection at 0x109c20c50>"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAW4AAAD8CAYAAABXe05zAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X+MHOd5H/Dvc8sluWfHXLq6AtJKNBXDESuaFc86WAwO\naEOltVTqhw9WElm1+gMwIiQtikgRrqBgAaRcFbqCSOQWNdAKiZGkUhVSknugLBV0W9IQypZqjr07\n07TJwrItSiujulRcJtGtyOXe0z925zg3O+/MO7MzszN73w9A4G5vf7zDI59993mf93lFVUFERMUx\nMugBEBFRNAzcREQFw8BNRFQwDNxERAXDwE1EVDAM3EREBcPATURUMAzcREQFw8BNRFQwG9J40uuu\nu063b9+exlMTEQ2l06dP/7mqjtncN5XAvX37dszNzaXx1EREQ0lE3ra9L1MlREQFw8BNRFQwDNxE\nRAXDwE1EVDAM3EREBcPATURUMKmUAxIRBZmdr+PQsfN4r9HEDdUKpu+6BVPjtUEPqzAYuIkoU7Pz\ndTzx7TNottoAgHqjiSe+fQYAGLwtMVVCRJk6dOz8atB2NFttHDp2fkAjKh4GbiLK1HuNZqTbqRcD\nNxFl6oZqJdLt1IuBm4gyNX3XLaiUS2tuq5RLmL7rlgGNqHi4OElEmXIWIG2rSliB0ouBm4gyNzVe\nswq+rEDxx1QJEeWWqQLl0cMLmJw5jtn5+oBGNliccRNRbgVVmvQ7+y5yCoYzbiLKrbBKk7j1304K\npt5oQnHtTaAoM3gGbiLKLb8KFK849d9F3wTEVAkR5Za7AqVuCNBx6r+LvgmIgZuIBios1+xUoHgr\nTID49d83VCu+bwRR3gQGmSNnqoSIBiZKrnlqvIZnvrQLtWoFAqBWreCZL+2KFSz73QQ06Bw5Z9xE\nNDBBuWa/gGxb/x3EmSk3W22URNBWRS3ijDnquJPGwE1EA5NVrtkJ1vVGEwJAu7e3VVEpl7B3xxgO\nHTuPxw4v9KQ9/FIig86RM3AT0cAkkWt2mHLO3ty4eh7XbLXxwqkLq7e768MB+O7crI6WcXG5lci4\n42DgJqKB2btjbE3QBOItOAZtjfdLa3j5BXOnNNAvJbJpwwgq5VIiC6VxMHAT0UDMztfxyul6T9D8\n3LYtPXnisAoOU8758SOLaKv3FewEpT0uNVt49sHdA6sqYeAmolSZgq5pJvw/3voAs/P1NTnmsEZT\nphrvtuqanLYf08+dtIffcys6bxaD2iZvHbhFpARgDkBdVe9Nb0hENCyCgq5pRusERffmG9Ns2hEU\nnP1ud+6/dbSMy602llsra37uTnt4a8cd9UYTjx1ewNzbH+DpqV2GV09HlBn37wD4EYBPpDQWIhoy\nQWVzpoVJYG1QNwX4tiqe+PYZbC6PBM6ovbaOlnHgvp0AnKC8NmhXK2UcvH9nTyrGNPN+/tQFAMg0\neFttwBGRGwHcA+AP0h0OEQ2ToLK56btugRge567OCKrUaLbavtUdQUY3bghM1Xxs04aenZsn999p\nHCsAvHDqQqYNqmx3Tn4DwD8HsBJ2RyIiR9D5klPjNXxlz7aegOitzrBpNBWF82YStRY76A3ESe9k\nJTRwi8i9AN5X1dMh93tEROZEZG5paSmxARJRcYVtLX96aheefXB34DZ2Z6t7SYLmvPacABz10OKg\nTwhAtg2qbHLckwDuF5F9ADYD+ISIPK+qD7vvpKrPAXgOACYmJuLV3xDRUJkar2Hu7Q/w4pvvoK2K\nkggeuL3m20Qq7HkA80KhiXfR0v2mMX3XLZGbVo1uLOHDK/6vn+Up9aEzblV9QlVvVNXtAL4M4Lg3\naBMR+XFqtZ1a6rYqXjldj5UPdmbetirlEr6yZ5txNh+laZVTHWMK2lmfUs86biJKTdLNmKbGa8ZN\nNSMCXL+lErghZna+jsmZ42vuc3L/nbGuwxG1QVUSIgVuVf0egO+lMhIiGjppNGN66I6bVkvw3P7+\nHdsCS/L6OTE+7OzLx48sZlrPzX7cRJSaqAuANp6e2oWH92xbXawsieDhPcFBG+jvuLKw8bZV8fyp\nC3hy9kzg/ZLCwE1Eqen3wAKTp6d24a1n9uFnM/fgrWf2Wc10+5n925YkvvjmO6H3SQJz3ESUGve2\n9SSaMfVzXFjcFrJ+By+YxG1oFRUDNxGlKqjcL0og7idHDcQr//O+pnPwwkettn8PlGRKzUMxVUJE\nAxH13MZ+ctRAvDMrTa9pmldXNmQTUjnjJqJURWnrGlQqaJujDprFR539R61+8TasSgsDNxGlJk5b\n13qj2VNrPTVes8pRx02nmB5nOqLMlOvOavckUyVElJqwtq5+BPBNn9hUqMRNpxhTIgrf13zojptS\nqZaxxcBNRKkJa+vqDX5+ByK40ydhOeq4JX+mn19qtnxf8+mpXZHz5UliqoSIUhOU3vArFQw7WCGs\nIVXckr+wcfq9pk1zrLRwxk00AE7PjJv3v4bJmeOZNuHPUlh6wzmk4Kcz9+Dk/jtR63OnZdwNP2lt\nFEoLAzdRxqKWwRVZ1BK8fgNonJK/fh43KKIp7PSZmJjQubm5xJ+XaBhMzhz3/Vheq1asOtUNO3dZ\n3pZKGSJAY7nV967LvBOR06o6YXNfzriJMpZGx7xh4qRPnn1wNy5fXcHF5dbQfzKJioGbKGNpdMwb\nRv3ulBxmDNxEGSvaQtig8JOJGQM3UcaKthA2KPxkYsY6bqIBGGQNcFHE6ea3XjBwEw2JfnpV51HS\nvbyHCQM30RDot1d1XvGTiT8GbqIMpD0bjtIiddhm5usRAzdRyrKYDUfpVT2MM/P1hlUlRClLqh45\nqL+JbQUGa6OHAwM3UcqSqEcO629iWxvO2ujhwFQJUcrithp1M82UHz+yCMC+AqOfsXhz43t3jOHE\nuSXmygeAgZsoZUnUI5tmxG3VNTnqsMAZdyx+ufHnT11Y/Tlz5dli4CZKmXs2XG80URJZk1e2Obx2\nxHDGIbC2eiSsYiRubbTfjD9oHJQuBm6iDDjBzLaiwzvDNQVtx3uNpu+s+NHDC3j8yALa3YdvHS3j\nwH07I7ePtc2BM1eeDQZuooxEqbW2meG63VCtGB/TdsX8i8stTL+8Ni/uxztz31Ipo9HsPe3ca0ul\nbD1mio+BmygjUSo6os5c9+4YwwuunHOQVlsDUytA7yeDcklQHhG0VoJn/iKRhk0xhQZuEdkM4A0A\nm7r3f1lVD6Q9MKJhE6WiI+jgXD8nzi1FeowptfLY4YWeU9aBTrDfOlrG6MYNeK9bkuinsRw+K6f+\n2dRxXwZwp6reBmA3gLtFZE+6wyIaPlH6cPvdN8h7jWakKhVTaiVoPt1Ybq0e7Gs61JepkmyEBm7t\n+Kvut+Xun+QPqiRaBzaXr/2Xq1bKxj7c3p7dYW6oVjA1XsPkpz9pNY7pu26JNKN3XsP9+PJI78ga\nzRa2D/nJ9XlgtXNSREoisgDgfQD/RVX
f9LnPIyIyJyJzS0tLSY+TqNCctMRFVyrh8tWVwMc4Zy8G\nzXCBtbP2F37zl/Hwnm0oBSSbt452ZsVR0tHeTwZT4zV8fLM508rzIdNlFbhVta2quwHcCODzIvJZ\nn/s8p6oTqjoxNjaW9DiJCq3fHiGm1MnW0d5Z+9NTu/DWM/vwjQd3+z5GFTh49Kz1x2bTCT1h+Wz2\nQElPpKoSVW2IyAkAdwP4QTpDIho+/fYICdo44zSf8t7uPOapV8+umenblPUBnVl20JFqNouhrOtO\nR+iMW0TGRKTa/boC4O8COJf2wIiGSXXUf9HOdLuXqWxv/OvfxaOHF4zNp6bGaxjdaD8/c1Is7t2d\npnSHzQIqz4dMh81v9HoAfywiJXQC/RFV/U66wyIaLqaNjyEbIgEAT86ewQunLqymNuqNJqZfWgSk\nU6bn5d3UYzvrrZRLeOD2Gl45Xffd3Qmgp8nU5vKIcaMQz4dMT2jgVtXvAxjPYCxEQ+uSIT1hut0x\nO19fE7QdYRth3MHalNJw12U7s3hTLt5b3+1tMgUA5ZLgYxs34FKzxW6BKePOSaIMxG2neujY+Vi1\nt97SPb+OgAfu29kTWB87vOD7fDZjaLUVH9u0AQsHvhBjxBQFD1IgykCUzTducRb3yiPSU7rn1IQD\nwfnrfnPSXIzMBgM3UQa8G2pMJXZecQLpxzdv6HneqfHa6puH02nQr9Y66o7NJMZL0TFwE2XEvaHG\nySf7nR/pFieQmuqrbWrJvW8wQRt5TOOl9DHHTZSxKCete+u3gw5UcJhmvba15O4acO9Yg1QrZS5G\nZoQzbqKMRd1F6Z6pr4QE7aC8uSmgBzWG8kvxPLxnm2++/uD9OwPHRsnhjJsoY6bdhjZNn4J2K9ZC\nSvCm77oF0y8t9pQSfnjlKmbn68bH+Z1lOfGpT0Y+/oySw8BNlLGSId1hk082lfbZLHROjdd6tr8D\naw9WsGVzMDGlh4GbKGOmHHVY7hqIf9ivw7RwyTK+YmHgJspYzZDuCGrd6tbPbDfuRiDKFy5OEmUs\n7macor82JYczbqKM9ZvuKOprU3JEbdqTRTQxMaFzc3OJPy8R0bASkdOqOmFzX6ZKiIgKhoGbiKhg\nGLiJiAqGi5O0LvkdBcYFOioKBm4aSkGBOUqTJ6I8YqqEho4TmE0H6JqaPD316tkBjJYoOs64aegE\ndd+bGq8Zt3dfXG5h/OvfRWOZZyZSvnHGTUMnrO900Pbui8st31k6UZ4wcNPQMQVm53bb7d2mHtmz\n83VMzhwPPb2GKC0M3DR0TP049u4Yw+TMcTx2eAG2J3J5Z+9h+XOiLDBw09DxO7Xlc9u24IVTF1YD\nrm2nh+ro2tNhop5eQ5QGLk7SUPKem/jY4QX4xeqSCFZUsaVSxl9evoq253SYv/po7ekwtuc2EqWJ\nM24qLNtc86Fj532DNnDt8IKPbdqAjaXe/ElrRdfMpsPy50RZYOCmQvLLNU+/tIjxr3+3J5CHzYad\nxzdbK74/dz+e/awpD9jWlQppcua41eG6gPmMR1vOIbzOTszqaBmqwKXmtXpvgD2uqT9R2royx02F\nFCWn3E/QFgB7d4yt2SJ/cbmFSrmEZx/cjanxGmbn65h+eRGtdud16o0mpl9eBMAt9JSO0FSJiNwk\nIidE5IciclZEfieLgREFySqnrABOnFsKrCR56tWzq0Hb0Wort9BTamxy3FcBPK6qtwLYA+Cfisit\n6Q6LhlVSm1f8cs1pqFUroZUkFw0np5tuJ+pXaOBW1Z+r6v/ufv2XAH4EgJ//KLIkN694a7VtNtSM\nWG66cTiLjqwkobyJlOMWke0AxgG8mcZgaLiFNX8yMbVo9dZqu/PMXpVyqee1g4wI8MyXdq0+vzvH\n7TyfsyhZrZTRaPbOrquVcs9tREmwDtwi8nEArwB4VFX/wufnjwB4BAC2bduW2ABpeISlHGbn63jq\n1bOrKYZqpYx7b7ser5yuh/bO9p5evqVShgjWdPo7dOy8dSWKex9O2Mno9952PZ4/daHnOe697Xqr\n1yKKyqocUETKAL4D4Jiq/n7Y/VkOSH6ilPCFqVUrOLn/zkiPCdpB2c9rmK4rzhhp/Ur0lHcREQB/\nCOBHNkGbyCTJBcU4W8ynxmv4yp5tsE11274Gt8FT1myqSiYB/AMAd4rIQvfPvpTHRUPIWVAs2bbm\nCxB3YXDiU59c0ziqUjb/F7B9DS5eUtZsqkr+u6qKqv5NVd3d/fN6FoOj4eIsMvazIQaIv8XcqWpx\nl+l9ZNjmLrDv281t8JQ17pykTHgP6O2Hu9ojCr+qFtNbiMJ+12PY4iVR0hi4KRN+QTOOWrUSOyBG\nyTnXfNIcQSfHu0sTidLGwE2ZSGKhzkk/BAXQIDdUK77VH4K1M2+/NIf3E4OpLJEoCwzclAnboOnl\nHHTg7sJnG0C9AX7vjrE1NeFAJ0g/cHsNJ84trXkjADplfs5tH16+GmvzEFEaGLgpE9N33dKT4w4L\n2pVyqSefPTlz3CqAPjl7Bi+curD6/PVGE6+crvsGab+A731zMEmqLp0oCgZuyoTfAl5Q0KsZgqop\n5VJvNDE5c3x1tuwO2o5mq40T55ZCN8VEyccnUdpIFBUDN2XGu4AXZ8dhUMB30iabyyPGmbxNrj2r\nXt9EcfHoMhqYOPXPYbsvm612YDtVm00xUTbO+FWfEKWNgZsGxtuatVathNZoux8Tle2mGr83h3JJ\nUPb0heUmGxoUnjlJhWVKtVQrZVy+utKzEPqVPdvw9NQuq+f2KzkEuMmG0hOlyRQDNxWW325MpxIF\nYJClYuFhwQUUd1PJeha21Zx/fzSsGLhzgLvy4uNWc1qPuDiZA0FHehEReXHGnQPD1IifKR+i9HHG\nnQPD0og/yVPciciMgTsHhqURP1M+RNlgqiQH+mnEn6fUxDClfIjyjIE7J+JUR+StGsXUR6RoKR+i\nvGOqpMDylpoYlpQPUd5xxp0TcVIeeUtN8OxFomwwcOdA3JRH1NSE+82hOlqGKnCp2Uo0wHJDDFH6\nmCrJgbgpjyipCW+p3sXlFhrNFsv2iAqIgTsHbFMes/N1TM4cx837X8PkzHEAsG6LGnaqC8v2iIqD\nqZIcMKU8RkQwO1/H1HgNs/N1TL+0iNZKp5tjvdHE9EuLOPTrt4UexQUkf/ILEQ0OZ9w5YDrVpa26\nmsI4ePTsatB2tFYUB4+etXoNm5K86mjZbsBENFCcceeAk9p4/MhizxmGTgqj0fQ/jst0u5ffKete\ncVuz21TE5GmjEFHRccadE1PjNawYImcSKQybI78uWb4JuNn0J2EPE6JkMXDnSFCzqdGy/6/KdLuf\nqfEaTu6/0xi84+xwtKmIOXj0bK42ChEVHQN3jgSV920ynGxuuj3q6wBAY/nKasWK7Ww4rCJmdr5u\nTOdwMZQontDALSLfEpH3ReQHWQxoPQs69byxbMhxG24Pe50Hbq9BPLd/eKUdOZVhmqUrOof5PvWq
\nefGUPUyI4rFZnPwjAP8WwJ+kOxQCzDsPk27gdOLcEoLWIp1URtgCYtCip994vY8louhCZ9yq+gaA\nDzIYCwVIuoFTUnXdNouefraOlllVQhRTYjluEXlEROZEZG5paSmppyWXTRuu/bq2jpaNuyRt2MzU\nbWfzU+O1SG8glXIJB+7baX1/IlorscCtqs+p6oSqToyNjSX1tIRr5XTuRb6PWit9PefeHWM9OW6v\n5StXrfLczvhMqpWy1bZ8IrLDDTgFEFRyFycAzs7X8crpek+OuzwCuN8PLi63eroU+m2k8Sv3c1TK\nJRy8fycDNVGCGLgLIGrf7bBdiqaGUysqAPx3bjr9UrztZ939U/xwdk2UvNDALSIvAvgVANeJyLsA\nDqjqH6Y9sPXKL+hGqSix6e1tCvje7fYO5/5+AT8oaNeqFQZtohTYVJU8pKrXq2pZVW9k0E6PaWv4\n3h1j1hUlNjsZTYuOI4akt9N8KuqGGZb7EaWDOydzxBR0T5xbsu67bZNWMZUWuqtW3JyJeJSacZb7\nEaWHOe4cCQq6tkeC2aRVTGdDPnZ4wfc5neZTNh0GHSz3I0oPA3eOBB2ocPP+16zaofoFV7+0it8b\nwaFj5wODvjfgj4j45sU52yZKF1MlORJ0oIJtD5GgfidxXt8b9J0Ogz+duQe/9xu3+d6fs22idHHG\nnSM2M1qb+u24J62bUih+z+VUvzRbbZS646zxgASiTDBw54w76N68/zXf+6TZDtUm6HtLDp03l+Ur\nV1MbFxFdw1RJjgUdrDBIpg08zk5LnmxDlC4G7hxLuiNgUoJm/DzZhih9DNw51s9CY5rCZvw82YYo\nXcxx51zchcY0hdVzDzqVQzTsOOOmyJxPAtVKuedneUjlEA07Bm6KZWq8hoUDX8A3Htydu1QO0bBj\nqoT6ksdUDtGw44ybiKhgGLiJiAqGgZuIqGAYuImICoaBm4ioYBi4iYgKhoGbiKhgGLiJiApm6Dbg\nPDl7Bi+++Q7aqiiJ4KE7bsLTU7sSfx3nIIH3Gk1sqZQhAjSWW1bHixER9SN3gdsdEKMGwSdnz+D5\nUxdWv2+rrn6fZPD2HiTQ6B6mC1w7XgwAgzcRpSJXqRInINYbTeszFt1efPOdSLfHZTpIwMGe1ESU\nplzNuP0Cos0Ziw6/E8eDbjcJm/Xb9JtmT2oiSkuuArcp2HlvdwfW6mgZqsAlV7rCqyRiPQZvGsQv\n9XFDtYJ6SGAeEVn9pBA39UNE5CdXqRKbMxa96ZSLyy00mi0EzakfuuMmq9efna/j8SOLxlm/w+9I\nMa+2KqZfXsT0S4uxUz9ERH5yNeP2O1nF25g/LL/s5q4qcWbp9UYTJRG0VVFzzYCdNwRTWsU963dm\nzO6qkr/4qIUVz0Nb7d7narbaePTwAr72n86gXBrBpSYrUYgomlwFbm9AjJtfFgA/nbln9Xtv+sMJ\nzu40SNgbgvfTgLsP9ex8HY8eXrC4wms+vNIGYE7HEBGZWAVuEbkbwL8GUALwB6o6k9aAwhrz2+SX\nvUH24NGzxqDcbLVx8OjZwBx50HFczptCv5xxMHATUZjQHLeIlAB8E8DfA3ArgIdE5Na0BzY7X8fk\nzHHcvP81TM4cX80Lh+WXvUF2dr6+ps7aT6PZQnW09/xEoJNueeD2Gg4dO98zFiB4pl4akUiLCI1m\ni/lvIgplE1c+D+DHqvoTVb0C4E8BfDHNQQXVczsH1TrnHG4dLaNaKRvPPLStp1aF7xuCs4nHPZbH\nDi/gydnOLDsodbOxJCiV7CtaooyXiNYvm1RJDYB7B8u7AO5IZzgdpnru3z3SySNHOefQtp76UrOF\nZx/cvbqAGUQBPH/qAr6z+PPAapZma8Xqtd1Y/01EYRIrBxSRR0RkTkTmlpaW+nouU/BaUWD65cVI\n6QRTiaHf/abGazi5/07ULB8TloKJw3a8RLR+2QTuOgB3IfSN3dvWUNXnVHVCVSfGxsb6GlRQ8Gq1\nFU+9etb6uWxqrp28+JOzZ/DpJ14PnXHbqJRL2GrImwNAuSQoj6xNowQtghIROWwC958B+IyI3Cwi\nGwF8GcDRNAcVFrwuLtsv4nlz4rVqBQ/v2bbm+2e+tAtzb3+A509diLw93sv9nAfu2+n7prF1tIxD\nv3YbDv36bT3jYFUJEYURtQhUIrIPwDfQKQf8lqr+y6D7T0xM6NzcXF8D2/3UdwNTEbVqBSf337nm\ntn46C376idf7DtpJj4mI1g8ROa2qEzb3tarjVtXXAbze16giOnj/Tky/tIiWdztil1//krAeI0H6\nDdqmNEeUhVQiIhu56lXiNjVew4OfN/cY8ebBgzoL2ojSiMpLXK/FOmwiSluutry7zc7X8cpp/yDo\nN7u17Szo9zqHjp3va8btPJJb14koC7mdcZt2JJZEfBfxbDoLerk3+vi9zuSnP7lm8dAGD1EgorTl\ndsZtmimbZsY2nQW9TG8OfouMADA5c9yqVJCbaIgoTbmdcQfNlP16WvuV/YWV10VNr9jUhIeNnYio\nX7mdcfvNoB2m48yiVnCYOg0GBd5NG0ZWx7RpwwguX+3d1r53R38bkIiIguR2xu3MoE2SSEf4zaBN\n6RUnH+6uLb/iE7QB4MS5/rb8ExEFyW3gBjrB27QomEQ6Ikp6xS8fbqpDYY6biNKU21SJI86iYxS2\n6ZUowZg5biJKU65n3EC8Rcc0mIKxd9sOG0URUdpyP+MG8rFt3DTzf+D2Gk6cW2IvEiLKTCECdx7Y\nHGRMRJSFwgfuLLvv5WHmT0RU6MDdb0dAIqIiyv3iZJB+OwISERVRoQN33I6ARERFVujAHacjIBFR\n0RU6cEfZsk5ENCwKvTjJEj0iWo8KHbgBlugR0fpT6FQJEdF6xMBNRFQwDNxERAXDwE1EVDAM3ERE\nBcPATURUMKJqOoCrjycVWQLwdh9PcR2AP09oOEXA6x1+6+2aeb3RfUpVrU4aTyVw90tE5lR1YtDj\nyAqvd/itt2vm9aaLqRIiooJh4CYiKpi8Bu7nBj2AjPF6h996u2Zeb4pymeMmIiKzvM64iYjIYKCB\nW0TuFpHzIvJjEdnv8/NNInK4+/M3RWR79qNMjsX1/q6I/FBEvi8i/01EPjWIcSYl7Hpd93tARFRE\nCl2FYHO9IvIb3d/xWRH5j1mPMUkW/563icgJEZnv/pveN4hxJkVEviUi74vIDww/FxH5N92/j++L\nyOdSG4yqDuQPgBKAtwD8IoCNABYB3Oq5zz8B8O+6X38ZwOFBjTej690LYLT79W8P+/V27/cLAN4A\ncArAxKDHnfLv9zMA5gFs7X7/1wc97pSv9zkAv939+lYAPxv0uPu85r8F4HMAfmD4+T4A/xmAANgD\n4M20xjLIGffnAfxYVX+iqlcA/CmAL3ru80UAf9z9+mUAvyoikuEYkxR6vap6QlWXu9+eAnBjxmNM\nks3vFwD+BYB/BeCjLAeXApvr/U0A31TViwCgqu9nPMY
k2VyvAvhE9+stAN7LcHyJU9U3AHwQcJcv\nAvgT7TgFoCoi16cxlkEG7hqAd1zfv9u9zfc+qnoVwCUAfy2T0SXP5nrdvorOu3dRhV5v96PkTar6\nWpYDS4nN7/eXAPySiJwUkVMicndmo0uezfUeBPCwiLwL4HUA/yyboQ1M1P/jsRX+BJxhJCIPA5gA\n8LcHPZa0iMgIgN8H8I8HPJQsbUAnXfIr6HyaekNEdqlqY6CjSs9DAP5IVX9PRH4ZwH8Qkc+q6sqg\nB1Z0g5xx1wHc5Pr+xu5tvvcRkQ3ofNz6f5mMLnk21wsR+TsAvgbgflW9nNHY0hB2vb8A4LMAvici\nP0MnJ3i0wAuUNr/fdwEcVdWWqv4UwP9BJ5AXkc31fhXAEQBQ1f8JYDM6PT2GldX/8SQMMnD/GYDP\niMjNIrIRncXHo577HAXwj7pf/xqA49pdBSig0OsVkXEA/x6doF3k/CcQcr2qeklVr1PV7aq6HZ2c\n/v2qOjeY4fbN5t/zLDqzbYjIdeikTn6S5SATZHO9FwD8KgCIyN9AJ3AvZTrKbB0F8A+71SV7AFxS\n1Z+n8koDXqXdh86s4y0AX+ve9nV0/gMDnV/0SwB+DOB/AfjFQY43g+v9rwD+L4CF7p+jgx5zmtfr\nue/3UOAjXupkAAAAeElEQVSqEsvfr6CTHvohgDMAvjzoMad8vbcCOIlOxckCgC8Mesx9Xu+LAH4O\noIXOp6evAvgtAL/l+v1+s/v3cSbNf8/cOUlEVDDcOUlEVDAM3EREBcPATURUMAzcREQFw8BNRFQw\nDNxERAXDwE1EVDAM3EREBfP/ATWTEFlkthKIAAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x109b9d390>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plt.scatter(dataset[:, 0], dataset[:, 1])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 绘制树回归曲线"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[<matplotlib.lines.Line2D at 0x109cca748>]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAD8CAYAAACMwORRAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAFhVJREFUeJzt3X2MXXWdx/H3h1JKkVIeOpTSdjogNdnKuoAjAppdVlat\n1bQm1E3ZqGBwG1lZJZqsoknV7l8krm4IRLYRYnFdRfEhI1tC2AWCGqkUKOVpMcPMnT7YJ9rSWukD\nU777xz3Dzl7u9J6598ycc+79vJKbe+45v7n3ezrTz5w593vuTxGBmZm1lxPyLsDMzLLncDcza0MO\ndzOzNuRwNzNrQw53M7M25HA3M2tDDnczszbkcDcza0MOdzOzNnRiXi88a9as6OnpyevlzcxK6Ykn\nnng5Iroajcst3Ht6etiwYUNeL29mVkqShtKM82kZM7M25HA3M2tDDnczszbkcDcza0Opw13SFElP\nSbqvzrZpku6R1C9pvaSeLIs0M7PxGc+R++eBF8bYdj2wLyIuAL4N3NJqYWZm1rxU4S5pHvBh4Ltj\nDFkGrE2W7wWukqTWyzMzs2ak7XP/V+CfgBljbJ8LbAGIiGFJ+4GzgJdbrtDMDOCpp+DnP8+7imy8\n973wgQ9M6Es0DHdJHwF2RcQTkq5s5cUkrQRWAnR3d7fyVGbWaVatgvvug3Y4KfClL+Uf7sB7gKWS\nlgAnA6dJ+veI+PioMduA+cBWSScCM4E9tU8UEWuANQC9vb2emdvM0hschI9+tH2O3idYw3PuEXFz\nRMyLiB5gBfBQTbAD9AHXJsvLkzEObzPLRgRUKuDPo0qt6c+WkbQa2BARfcCdwPcl9QN7qf4SMDPL\nxp498Kc/OdzHYVzhHhGPAI8ky6tGrT8MfCzLwszM3lCpVO8d7qn5ClUzK77Bweq9wz01h7uZFZ+P\n3MfN4W5mxVepwBlnwMyZeVdSGg53Mys+d8qMm8PdzIrP4T5uDnczKzb3uDfF4W5mxbZ7N7z6qsN9\nnBzuZlZs7pRpisPdzIptJNzPOy/XMsrG4W5mxTYS7gsW5FpG2TjczazYKhU480w47bS8KykVh7uZ\nFZs7ZZricDezYnO4N8XhbmbF5R73pjnczay4du2CQ4cc7k1wuJtZcbkNsmkNw13SyZJ+J+lpSc9J\n+kadMddJ2i1pY3L79MSUa2YdxRcwNS3NTExHgPdFxEFJU4FfS7o/Ih6rGXdPRNyYfYlm1rHc4960\nhuGeTHR9MHk4Nbl58mszm3iVCpx1FsyYkXclpZPqnLukKZI2AruAByNifZ1hV0vaJOleSfPHeJ6V\nkjZI2rB79+4WyjazjjA46FMyTUoV7hFxLCIuAuYBl0q6sGbIL4GeiHgH8CCwdoznWRMRvRHR29XV\n1UrdZtYJ3AbZtHF1y0TEK8DDwOKa9Xsi4kjy8LvAO7Mpz8w6VgQMDblTpklpumW6JJ2eLE8H3g/8\nT82YOaMeLgVeyLJIM+tAO3fC4cM+cm9Smm6ZOcBaSVOo/jL4cUTcJ2k1sCEi+oDPSVoKDAN7gesm\nqmAz6xBug2xJmm6ZTcDFddavGrV8M3BztqWZWUdzuLfEV6iaWTG5x70lDnczK6bBQZg1C049Ne9K\nSsnhbmbF5DbIljjczayYKhW3QbbA4W5mxfP669Uedx+5N83hbmbFs3MnHDnicG+Bw93MisdtkC1z\nuJtZ8TjcW+ZwN7PiGRys3rvHvWkOdzMrnkoFurrgLW/Ju5LScribWfG4DbJlDnczKx5fwNQyh7uZ\nFYt73DPhcDezYtmxA44edbi3yOFuZsUy0injcG9JmpmYTpb0O0lPS3pO0jfqjJkm6R5J/ZLWS+qZ\niGLNrAO4xz0TaY7cjwDvi4i/AC4CFku6rGbM9cC+iLgA+DZwS7ZlmlnHcLhnomG4R9XB5OHU5BY1\nw5YBa5Ple4GrJCmzKs2sc1QqMHs2TJ+edyWllmYOVZL5U58ALgBuj4j1NUPmAlsAImJY0n7gLODl\nDGs16yzDw3DoUN5VTL6XXvJRewZShXtEHAMuknQ68HNJF0bEs+N9MUkrgZUA3d3d4/1ys84RAQsX\n/t8pik6zYkXeFZReqnAfERGvSHoYWAyMDvdtwHxgq6QTgZnAnjpfvwZYA9Db21t7asfMRuzcWQ32\n5cvhstq3uDrA0qV5V1B6DcNdUhfwWhLs04H38+Y3TPuAa4HfAsuBhyLC4W3WrJEj9uuugw9/OM9K\nrKTSHLnPAdYm591PAH4cEfdJWg1siIg+4E7g+5L6gb2A/6Yya4V7va1FDcM9IjYBF9dZv2rU8mHg\nY9mWZtbB3A5oLfIVqmZF5I+8tRY53M2KyJ+KaC1yuJsVkcPdWuRwNysaf+StZcDhblY0O3bAkSMO\nd2uJw92saNwpYxlwuJsVzUi4ew5Ra4HD3axoRsJ9wYJcy7Byc7ibFU2lAmefDaecknclVmIOd7Oi\ncRukZcDhblY0DnfLgMPdrEjc424ZcbibFcn27XD0qMPdWuZwNysSt0FaRhzuZkXiC5gsIw53syJx\nj7tlpGG4S5ov6WFJz0t6TtLn64y5UtJ+SRuT26p6z2VmDVQqMHs2TJ+edyVWcmmm2RsGvhgRT0qa\nATwh6cGIeL5m3K8i4iPZl2jWQdwGaRlpeOQeEdsj4slk+Y/AC8DciS7MrCMNDjrcLRPjOucuqYfq\nfKrr62y+XNLTku6X9PYMajPrLMeOwebNDnfLRJrTMgBIOhX4KXBTRByo2fwksCAiDkpaAvwCWFjn\nOVYCKwG6u7ubLtqsLW3fDq+95jZIy0SqI3dJU6kG+w8i4me12yPiQEQcTJbXAVMlzaozbk1E9EZE\nb1dXV4ulm7UZt0FahtJ0ywi4E3ghIr41xphzknFIujR53j1ZFmrW9hzulqE0p2XeA3wCeEbSxmTd\nV4BugIi4A1gO3CBpGDgErIiImIB6zdrXSLj7lKVloGG4R8SvATUYcxtwW1ZFmXWkSgXOOcc97pYJ\nX6FqVhRug7QMOdzNiqJScaeMZcbhblYE7nG3jDnczYrgD3+A4WGHu2XG4W5WBG6DtIw53M2KwOFu\nGXO4mxXB4GD13j3ulhGHu1kRVCowZw6cfHLelVibcLibFYHbIC1jDnezIvAkHZYxh7tZ3oaHYcsW\nh7tlyuFuljf3uNsEcLib5c1tkDYBHO5meRtpg3S4W4Yc7mZ5q1RAco+7Zcrhbpa3SgXOPRemTcu7\nEmsjaabZmy/pYUnPS3pO0ufrjJGkWyX1S9ok6ZKJKdesDbkN0iZAmiP3YeCLEbEIuAz4rKRFNWM+\nBCxMbiuB72RapVk7c7jbBGgY7hGxPSKeTJb/CLwAzK0Ztgy4O6oeA06XNCfzas3ajXvcbYKkmSD7\nDZJ6gIuB9TWb5gJbRj3emqzbXvP1K6ke2dPtN48sjQhYvRp27Mi7kolx6FB1og6Hu2UsdbhLOhX4\nKXBTRBxo5sUiYg2wBqC3tzeaeQ7rML/
/PXz96zBzZvu+4bhgAVxxRd5VWJtJFe6SplIN9h9ExM/q\nDNkGzB/1eF6yzqw1AwPV+3XrHIBm45CmW0bAncALEfGtMYb1AZ9MumYuA/ZHxPYxxpqlNxLu55+f\nbx1mJZPmyP09wCeAZyRtTNZ9BegGiIg7gHXAEqAfeBX4VPalWkcaGIDp02H27LwrMSuVhuEeEb8G\n1GBMAJ/NqiizNwwMVI/addwfQTOr4StUrdhGwt3MxsXhbsUV4XA3a5LD3Yrr5Zfh4EGHu1kTHO5W\nXO6UMWuaw92Ky+Fu1jSHuxXXSLj70nyzcXO4W3ENDMCcOXDKKXlXYlY6DncrrsFBn5Ixa5LD3YrL\nbZBmTXO4WzEdPVr9nHOHu1lTHO5WTJs3w+uvw3nn5V2JWSk53K2Y3AZp1hKHuxWTw92sJQ53K6aB\ngerMS3M8Fa9ZMxzuVkwDA9Xz7Sf4R9SsGWlmYrpL0i5Jz46x/UpJ+yVtTG6rsi/TOo7bIM1akuaw\n6HvA4gZjfhURFyW31a2XZR0tAl56yeFu1oKG4R4RjwJ7J6EWs6p9++DAAYe7WQuyOqF5uaSnJd0v\n6e0ZPad1KnfKmLUszQTZjTwJLIiIg5KWAL8AFtYbKGklsBKgu7s7g5e2tuRwN2tZy0fuEXEgIg4m\ny+uAqZJmjTF2TUT0RkRvV1dXqy9t7Wok3H11qlnTWg53SedI1anpJV2aPOeeVp/XOtjgIJx9Npx6\nat6VmJVWw9Mykn4IXAnMkrQV+BowFSAi7gCWAzdIGgYOASsiIiasYmt/boM0a1nDcI+Iaxpsvw24\nLbOKzAYG4N3vzrsKs1Lz5X9WLMPDMDTkI3ezFjncrVi2bIFjxxzuZi1yuFuxuA3SLBMOdysWh7tZ\nJhzuViwDAzB1Ksydm3clZqXmcLdiGRiAnh6YMiXvSsxKzeFuxeIed7NMONytWBzuZplwuFtxvPIK\n7N3rcDfLgMPdimNwsHrvcDdrmcPdisPhbpYZh7sVhz/q1ywzDncrjoEBOPNMmDkz70rMSs/hbsXh\nThmzzDjcrTgc7maZcbhbMRw7BpWKw90sIw3DXdJdknZJenaM7ZJ0q6R+SZskXZJ9mdb2tm2D115z\nuJtlpOFMTMD3qM60dPcY2z8ELExu7wa+k9zbZIiAm26CF1/Mu5LW7N9fvXe4m2UizTR7j0rqOc6Q\nZcDdybypj0k6XdKciNieUY12PPv2wa23Vj9sa/bsvKtpzQc/CO96V95VmLWFNEfujcwFtox6vDVZ\n96Zwl7QSWAnQ3d2dwUsbQ0PV+29+E66+Ot9azKwwJvUN1YhYExG9EdHb1dU1mS/dvjZvrt4vWJBv\nHWZWKFmE+zZg/qjH85J1NhlGjtwd7mY2Shbh3gd8MumauQzY7/Ptk2hoCKZPh1mz8q7EzAqk4Tl3\nST8ErgRmSdoKfA2YChARdwDrgCVAP/Aq8KmJKtbq2LwZurtByrsSMyuQNN0y1zTYHsBnM6vIxmdo\nqBruZmaj+ArVstu82efbzexNHO5ldvgw7NzpcDezN3G4l9lIG6RPy5hZDYd7mbnH3czG4HAvM/e4\nm9kYHO5lNjQEJ5wAc+fmXYmZFYzDvcw2b4Zzz4WpU/OuxMwKxuFeZkNDPiVjZnU53MvMFzCZ2Rgc\n7mV17Bhs3eojdzOry+FeVjt2VKelc7ibWR0O97IaaYP0aRkzq8PhXla+gMnMjsPhXlY+cjez43C4\nl9XQEJxxBsyYkXclZlZAqcJd0mJJL0rql/TlOtuvk7Rb0sbk9unsS7X/xx/1a2bHkWYmpinA7cD7\nga3A45L6IuL5mqH3RMSNE1Cj1TM0BG99a95VmFlBpTlyvxToj4iBiDgK/AhYNrFlWUMj0+uZmdWR\nJtznAltGPd6arKt1taRNku6VND+T6qy+V16BAwd8WsbMxpTVG6q/BHoi4h3Ag8DaeoMkrZS0QdKG\n3bt3Z/TSHcidMmbWQJpw3waMPhKfl6x7Q0TsiYgjycPvAu+s90QRsSYieiOit6urq5l6DdzjbmYN\npQn3x4GFks6TdBKwAugbPUDSnFEPlwIvZFeivYkn6TCzBhp2y0TEsKQbgQeAKcBdEfGcpNXAhojo\nAz4naSkwDOwFrpvAmm1oCKZNA//1Y2ZjaBjuABGxDlhXs27VqOWbgZuzLc3GNNIpc4KvQTOz+pwO\nZeRJOsysAYd7GXmSDjNrwOFeNkeOVD/L3UfuZnYcDvey2ZJcT+ZwN7PjcLiXjS9gMrMUHO5l4wuY\nzCwFh3vZDA2BBPPm5V2JmRWYw71shoZgzhw46aS8KzGzAnO4l40n6TCzFBzuZeMLmMwsBYd7mbz+\nerUV0p0yZtaAw71Mdu6Eo0d95G5mDTncy8Qf9WtmKTncy2Skx92nZcysAYd7mfjI3cxScriXydAQ\nzJwJp52WdyVmVnCpwl3SYkkvSuqX9OU626dJuifZvl5ST9aFGu5xN7PUGoa7pCnA7cCHgEXANZIW\n1Qy7HtgXERcA3wZuybpQwz3uZpZamiP3S4H+iBiIiKPAj4BlNWOWAWuT5XuBqyQpuzIN8CQdZpZa\nmnCfC2wZ9Xhrsq7umIgYBvYDZ2VRoCX276/efORuZimkmiA7K5JWAisBups9An3gAfjCFzKsqiSO\nHq3eO9zNLIU04b4NmD/q8bxkXb0xWyWdCMwE9tQ+UUSsAdYA9Pb2RjMFc9ppsKj2lH+HuOIKuOqq\nvKswsxJIE+6PAwslnUc1xFcAf1czpg+4FvgtsBx4KCKaC+9GLr8cfvKTCXlqM7N20TDcI2JY0o3A\nA8AU4K6IeE7SamBDRPQBdwLfl9QP7KX6C8DMzHKS6px7RKwD1tWsWzVq+TDwsWxLMzOzZvkKVTOz\nNuRwNzNrQw53M7M25HA3M2tDDnczszbkcDcza0OaqGuNGr6wtBsYavLLZwEvZ1hOGXifO4P3uTO0\nss8LIqKr0aDcwr0VkjZERG/edUwm73Nn8D53hsnYZ5+WMTNrQw53M7M2VNZwX5N3ATnwPncG73Nn\nmPB9LuU5dzMzO76yHrmbmdlxFDrcJS2W9KKkfklfrrN9mqR7ku3rJfVMfpXZSrHPX5D0vKRNkv5b\nUumnZmq0z6PGXS0pJJW+syLNPkv62+R7/Zyk/5jsGrOW4me7W9LDkp5Kfr6X5FFnViTdJWmXpGfH\n2C5Jtyb/HpskXZJpARFRyBvVz45/CTgfOAl4GlhUM+YfgDuS5RXAPXnXPQn7/NfAKcnyDZ2wz8m4\nGcCjwGNAb951T8L3eSHwFHBG8vjsvOuehH1eA9yQLC8CKnnX3eI+/yVwCfDsGNuXAPcDAi4D1mf5\n+kU+cr8U6I+IgYg4CvwIWFYzZhmwNlm+F7hKkiaxxqw13OeIeDgiXk0ePkZ12sMyS/N9Bvhn4Bbg\n8GQWN0HS7PPfA7dHxD6AiNg1yTVmLc0+B3BasjwT+MMk1pe5iHiU6uRFY1kG3B1VjwGnS5qT1esX
\nOdznAltGPd6arKs7JiKGgf3AWZNS3cRIs8+jXU/1N3+ZNdzn5M/V+RHxn5NZ2ARK831+G/A2Sb+R\n9JikxZNW3cRIs89fBz4uaSvVyYH+cXJKy814/7+PS6qZmKx4JH0c6AX+Ku9aJpKkE4BvAdflXMpk\nO5HqqZkrqf519qikP4+IV3KtamJdA3wvIv5F0uVUp+68MCJez7uwMirykfs2YP6ox/OSdXXHSDqR\n6p9yeyaluomRZp+R9DfAV4GlEXFkkmqbKI32eQZwIfCIpArVc5N9JX9TNc33eSvQFxGvRcQg8Huq\nYV9Wafb5euDHABHxW+Bkqp/B0q5S/X9vVpHD/XFgoaTzJJ1E9Q3TvpoxfcC1yfJy4KFI3qkoqYb7\nLOli4N+oBnvZz8NCg32OiP0RMSsieiKih+r7DEsjYkM+5WYizc/2L6getSNpFtXTNAOTWWTG0uzz\nZuAqAEl/RjXcd09qlZOrD/hk0jVzGbA/IrZn9ux5v6Pc4N3mJVSPWF4CvpqsW031PzdUv/k/AfqB\n3wHn513zJOzzfwE7gY3JrS/vmid6n2vGPkLJu2VSfp9F9XTU88AzwIq8a56EfV4E/IZqJ81G4AN5\n19zi/v4Q2A68RvUvseuBzwCfGfU9vj3593gm659rX6FqZtaGinxaxszMmuRwNzNrQw53M7M25HA3\nM2tDDnczszbkcDcza0MOdzOzNuRwNzNrQ/8LfR/vUvqwLfQAAAAASUVORK5CYII=\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x109c6def0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"x = np.linspace(0, 1, 50)\n",
"y = [tree_predict([i], tree) for i in x]\n",
"plt.plot(x, y, c='r')"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
================================================
FILE: classification_and_regression_trees/notebook/后剪枝.ipynb
================================================
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from prune import *"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 加载数据"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"data = load_data('ex2.txt')\n",
"tree = create_tree(data, fleaf, ferr)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 判断树结构"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"not_tree(tree)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 对树结构进行塌陷处理"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"53.136107929136443"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"collapse(tree)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 输出树结构"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'feat_idx': 0,\n",
" 'feat_val': 0.50854200000000005,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.46324100000000001,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.13062599999999999,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.085111000000000006,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.053763999999999999,\n",
" 'left': 4.0916259999999998,\n",
" 'right': -2.5443927142857148},\n",
" 'right': 6.5098432857142843},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.37738300000000002,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.3417,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.32889000000000002,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.30031799999999997,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.17652300000000001,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.156273,\n",
" 'left': -6.2479000000000013,\n",
" 'right': -12.107972500000001},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.20399300000000001,\n",
" 'left': 3.4496025000000001,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.21832099999999999,\n",
" 'left': -11.822278500000001,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.228628,\n",
" 'left': 6.770429,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.26463900000000001,\n",
" 'left': -13.070501,\n",
" 'right': 0.40377471428571476}}}}},\n",
" 'right': -19.994155200000002},\n",
" 'right': 15.059290750000001},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.35147800000000001,\n",
" 'left': -22.693879600000002,\n",
" 'right': -15.085111749999999}},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.44619599999999998,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.41894300000000001,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.388789,\n",
" 'left': 3.6584772500000016,\n",
" 'right': -0.89235549999999952},\n",
" 'right': 14.38417875},\n",
" 'right': -12.558604833333334}}},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.48380299999999998,\n",
" 'left': 3.4331330000000007,\n",
" 'right': 12.50675925}},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.73163599999999995,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.64237299999999997,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.61886799999999997,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.58541299999999996,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.56030100000000005,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.53194399999999997,\n",
" 'left': 101.73699325000001,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.546601,\n",
" 'left': 110.979946,\n",
" 'right': 109.38961049999999}},\n",
" 'right': 97.200180249999988},\n",
" 'right': 123.2101316},\n",
" 'right': 93.673449714285724},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.66785099999999997,\n",
" 'left': 114.15162428571431,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.70889000000000002,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.69891999999999999,\n",
" 'left': 108.92921799999999,\n",
" 'right': 104.82495374999999},\n",
" 'right': 114.554706}}},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.95390200000000003,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.76332800000000001,\n",
" 'left': 78.085643250000004,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.79819799999999996,\n",
" 'left': 102.35780185714285,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.83858699999999997,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.81521500000000002,\n",
" 'left': 88.784498800000009,\n",
" 'right': 81.110151999999999},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.94882200000000005,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.85642099999999999,\n",
" 'left': 95.275843166666661,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.912161,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.89668300000000001,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.88361500000000004,\n",
" 'left': 102.25234449999999,\n",
" 'right': 95.181792999999999},\n",
" 'right': 104.82540899999999},\n",
" 'right': 96.452866999999998}},\n",
" 'right': 87.310387500000004}}}},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.96039799999999997,\n",
" 'left': 112.42895575000001,\n",
" 'right': 105.24862350000001}}}}"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tree"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 使用测试数据进行后剪枝"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"data_test = load_data('ex2test.txt')"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"merged\n",
"merged\n",
"merged\n",
"merged\n",
"merged\n",
"merged\n",
"merged\n",
"merged\n"
]
}
],
"source": [
"pruned_tree = postprune(tree, data_test)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'feat_idx': 0,\n",
" 'feat_val': 0.50854200000000005,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.46324100000000001,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.13062599999999999,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.085111000000000006,\n",
" 'left': 0.77361664285714249,\n",
" 'right': 6.5098432857142843},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.37738300000000002,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.3417,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.32889000000000002,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.30031799999999997,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.17652300000000001,\n",
" 'left': -9.1779362500000019,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.20399300000000001,\n",
" 'left': 3.4496025000000001,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.21832099999999999,\n",
" 'left': -11.822278500000001,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.228628,\n",
" 'left': 6.770429,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.26463900000000001,\n",
" 'left': -13.070501,\n",
" 'right': 0.40377471428571476}}}}},\n",
" 'right': -19.994155200000002},\n",
" 'right': 15.059290750000001},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.35147800000000001,\n",
" 'left': -22.693879600000002,\n",
" 'right': -15.085111749999999}},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.44619599999999998,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.41894300000000001,\n",
" 'left': 1.3830608750000011,\n",
" 'right': 14.38417875},\n",
" 'right': -12.558604833333334}}},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.48380299999999998,\n",
" 'left': 3.4331330000000007,\n",
" 'right': 12.50675925}},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.73163599999999995,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.64237299999999997,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.61886799999999997,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.58541299999999996,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.56030100000000005,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.53194399999999997,\n",
" 'left': 101.73699325000001,\n",
" 'right': 110.18477824999999},\n",
" 'right': 97.200180249999988},\n",
" 'right': 123.2101316},\n",
" 'right': 93.673449714285724},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.66785099999999997,\n",
" 'left': 114.15162428571431,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.70889000000000002,\n",
" 'left': 106.87708587499999,\n",
" 'right': 114.554706}}},\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.95390200000000003,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.76332800000000001,\n",
" 'left': 78.085643250000004,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.79819799999999996,\n",
" 'left': 102.35780185714285,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.83858699999999997,\n",
" 'left': 84.947325400000011,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.94882200000000005,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.85642099999999999,\n",
" 'left': 95.275843166666661,\n",
" 'right': {'feat_idx': 0,\n",
" 'feat_val': 0.912161,\n",
" 'left': {'feat_idx': 0,\n",
" 'feat_val': 0.89668300000000001,\n",
" 'left': 98.717068749999996,\n",
" 'right': 104.82540899999999},\n",
" 'right': 96.452866999999998}},\n",
" 'right': 87.310387500000004}}}},\n",
" 'right': 108.838789625}}}"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pruned_tree"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 生成树结构dot文件用于显示"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"with open('ex2_prune.dot', 'w') as f:\n",
" content = dotify(pruned_tree)\n",
" f.write(content)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1m\u001b[36m__pycache__\u001b[m\u001b[m/ ex2test.txt\r\n",
"\u001b[1m\u001b[36mdot\u001b[m\u001b[m/ \u001b[1m\u001b[36mpic\u001b[m\u001b[m/\r\n",
"ex0.txt prune.py\r\n",
"ex00.txt regression_tree.py\r\n",
"ex2.txt 后剪枝.ipynb\r\n",
"ex2_prune.dot 分段函数回归树.ipynb\r\n"
]
}
],
"source": [
"ls"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
================================================
FILE: classification_and_regression_trees/notebook/模型树对分段线性函数进行回归.ipynb
================================================
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from model_tree import *"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 加载数据"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"dataset = load_data('exp2.txt')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 创建模型树"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'feat_idx': 0, 'feat_val': 0.30440099999999998, 'left': matrix([[ 3.46877936],\n",
" [ 1.18521743]]), 'right': matrix([[ 1.69855694e-03],\n",
" [ 1.19647739e+01]])}"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tree = create_tree(dataset, fleaf, ferr, opt={'err_tolerance': 0.1, 'n_tolerance': 4})\n",
"tree"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 绘制回归曲线"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXwAAAD8CAYAAAB0IB+mAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAADU9JREFUeJzt3GGI5Hd9x/H3xztTaYym9FaQu9Ok9NJ42ELSJU0Raoq2\nXPLg7oFF7iBYJXhgGylVhBRLlPjIhloQrtWTilXQGH0gC57cA40ExAu3ITV4FyLb03oXhawxzZOg\nMe23D2bSna53mX92Z3cv+32/4GD+//ntzJcfe++dndmZVBWSpO3vFVs9gCRpcxh8SWrC4EtSEwZf\nkpow+JLUhMGXpCamBj/JZ5M8meT7l7g+ST6ZZCnJo0lunP2YkqT1GvII/3PAgRe5/lZg3/jfUeBf\n1j+WJGnWpga/qh4Efv4iSw4Bn6+RU8DVSV4/qwElSbOxcwa3sRs4P3F8YXzup6sXJjnK6LcArrzy\nyj+8/vrrZ3D3ktTHww8//LOqmlvL184i+INV1XHgOMD8/HwtLi5u5t1L0stekv9c69fO4q90ngD2\nThzvGZ+TJF1GZhH8BeBd47/WuRl4pqp+7ekcSdLWmvqUTpIvAbcAu5JcAD4CvBKgqj4FnABuA5aA\nZ4H3bNSwkqS1mxr8qjoy5foC/npmE0mSNoTvtJWkJgy+JDVh8CWpCYMvSU0YfElqwuBLUhMGX5Ka\nMPiS1ITBl6QmDL4kNWHwJakJgy9JTRh8SWrC4EtSEwZfkpow+JLUhMGXpCYMviQ1YfAlqQmDL0lN\nGHxJasLgS1ITBl+SmjD4ktSEwZekJgy+JDVh8CWpCYMvSU0YfElqwuBLUhMGX5KaMPiS1ITBl6Qm\nDL4kNWHwJamJQcFPciDJ40mWktx1kevfkOSBJI8keTTJbbMfVZK0HlODn2QHcAy4FdgPHEmyf9Wy\nvwfur6obgMPAP896UEnS+gx5hH8TsFRV56rqOeA+4NCqNQW8Znz5tcBPZjeiJGkWhgR/N3B+4vjC\n+NykjwK3J7kAnADef7EbSnI0yWKSxeXl5TWMK0laq1m9aHsE+FxV7QFuA76Q5Nduu6qOV9V8Vc3P\nzc3N6K4lSUMMCf4TwN6J4z3jc5PuAO4HqKrvAq8Cds1iQEnSbAwJ/mlgX5Jrk1zB6EXZhVVrfgy8\nDSDJmxgF3+dsJOkyMjX4VfU8cCdwEniM0V/jnElyT5KD42UfBN6b5HvAl4B3V1Vt1NCSpJdu55BF\nVXWC0Yuxk+funrh8FnjLbEeTJM2S77SVpCYMviQ1YfAlqQmDL0lNGHxJasLgS1ITBl+SmjD4ktSE\nwZekJgy+JDVh8CWpCYMvSU0YfElqwuBLUhMGX5KaMPiS1ITBl6QmDL4kNWHwJakJgy9JTRh8SWrC\n4EtSEwZfkpow+JLUhMGXpCYMviQ1YfAlqQmDL0lNGHxJasLgS1ITBl+SmjD4ktSEwZekJgy+JDUx\nKPhJDiR5PMlSkrsuseadSc4mOZPki7MdU5K0XjunLUiyAzgG/BlwATidZKGqzk6s2Qf8HfCWqno6\nyes2amBJ0toMeYR/E7BUVeeq6jngPuDQqjXvBY5V1dMAVfXkbMeUJK3XkODvBs5PHF8Yn5t0HXBd\nku8kOZXkwMVuKMnRJItJFpeXl9c2sSRpTWb1ou1OYB9wC3AE+EySq1cvqqrjVTVfVfNzc3MzumtJ\n0hBDgv8EsHfieM/43KQLwEJV/aqqfgj8gNEPAEnSZWJI8E8D+5Jcm+QK4DCwsGrN1xg9uifJLkZP\n8Zyb4ZySpHWaGvyqeh64EzgJPAbcX1VnktyT5OB42UngqSRngQeAD1XVUxs1tCTppUtVbckdz8/P\n1+Li4pbctyS9XCV5uKrm1/K1vtNWkpow+JLUhMGXpCYMviQ1YfAlqQmDL0lNGHxJasLgS1ITBl+S\nmjD4ktSEwZekJgy+JDVh8CWpCYMvSU0YfElqwuBLUhMGX5KaMPiS1ITBl6QmDL4kNWHwJakJgy9J\nTRh8SWrC4EtSEwZfkpow+JLUhMGXpCYMviQ1YfAlqQmDL0lNGHxJasLgS1ITBl+SmjD4ktSEwZek\nJgYFP8mBJI8nWUpy14use0eSSjI/uxElSbMwNfhJdgDHgFuB/cCRJPsvsu4q4G+Ah2Y9pCRp/YY8\nwr8JWKqqc1X1HHAfcOgi6z4GfBz4xQznkyTNyJDg7wbOTxxfGJ/7P0luBPZW1ddf7IaSHE2ymGRx\neXn5JQ8rSVq7db9om+QVwCeAD05bW1XHq2q+qubn5ubWe9eSpJdgSPCfAPZOHO8Zn3vBVcCbgW8n\n+RFwM7DgC7eSdHkZEvzTwL4k1ya5AjgMLLxwZVU9U1W7quqaqroGOAUcrKrFDZlYkrQmU4NfVc8D\ndwIngceA+6vqTJJ7khzc6AElSbOxc8iiqjoBnFh17u5LrL1l/WNJkmbNd9pKUhMGX5KaMPiS1ITB\nl6QmDL4kNWHwJakJgy9JTRh8SWrC4EtSEwZfkpow+JLUhMGXpCYMviQ1YfAlqQmDL0lNGHxJasLg\nS1ITBl+SmjD4ktSEwZekJgy+JDVh8CWpCYMvSU0YfElqwuBLUhMGX5KaMPiS1ITBl6QmDL4kNWHw\nJakJgy9JTRh8SWrC4EtSEwZfkpoYFPwkB5I8nmQpyV0Xuf4DSc4meTTJN5O8cfajSpLWY2rwk+wA\njgG3AvuBI0n2r1r2CDBfVX8AfBX4h1kPKklanyGP8G8ClqrqXFU9B9wHHJpcUFUPVNWz48NTwJ7Z\njilJWq8hwd8NnJ84vjA+dyl3AN+42BVJjiZZTLK4vLw8fEpJ0rrN9EXbJLcD88C9F7u+qo5X1XxV\nzc/Nzc3yriVJU+wcsOYJYO/E8Z7xuf8nyduBDwNvrapfzmY8SdKsDHmEfxrYl+TaJFcAh4GFyQVJ\nbgA+DRysqidnP6Ykab2mBr+qngfuBE4CjwH3V9WZJPckOThedi/wauArSf49ycIlbk6StEWGPKVD\nVZ0ATqw6d/fE5bfPeC5J0oz5TltJasLgS1ITBl+SmjD4ktSEwZekJgy+JDVh8CWpCYMvSU0YfElq\nwuBLUhMGX5KaMPiS1ITBl6QmDL4kNWHwJakJgy9JTRh8SWrC4EtSEwZfkpow+JLUhMGXpCYMviQ1\nYfAlqQmDL0lNGHxJasLgS1ITBl+SmjD4ktSEwZekJgy+JDVh8CWpCYMvSU0YfElqwuBLUhMGX5Ka\nGBT8JAeSPJ5kKcldF7n+N5J8eXz9Q0mumfWgkqT1mRr8JDuAY8CtwH7gSJL9q5bdATxdVb8L/BPw\n8VkPKklanyGP8G8ClqrqXFU9B9wHHFq15hDwb+PLXwXeliSzG1OStF47B6zZDZyfOL4A/NGl1lTV\n80meAX4b+NnkoiRHgaPjw18m+f5aht6GdrFqrxpzL1a4FyvcixW/t9YvHBL8mamq48BxgCSLVTW/\nmfd/uXIvVrgXK9yLFe7FiiSLa/3aIU/
pPAHsnTjeMz530TVJdgKvBZ5a61CSpNkbEvzTwL4k1ya5\nAjgMLKxaswD85fjyXwDfqqqa3ZiSpPWa+pTO+Dn5O4GTwA7gs1V1Jsk9wGJVLQD/CnwhyRLwc0Y/\nFKY5vo65txv3YoV7scK9WOFerFjzXsQH4pLUg++0laQmDL4kNbHhwfdjGVYM2IsPJDmb5NEk30zy\nxq2YczNM24uJde9IUkm27Z/kDdmLJO8cf2+cSfLFzZ5xswz4P/KGJA8keWT8/+S2rZhzoyX5bJIn\nL/VepYx8crxPjya5cdANV9WG/WP0Iu9/AL8DXAF8D9i/as1fAZ8aXz4MfHkjZ9qqfwP34k+B3xxf\nfl/nvRivuwp4EDgFzG/13Fv4fbEPeAT4rfHx67Z67i3ci+PA+8aX9wM/2uq5N2gv/gS4Efj+Ja6/\nDfgGEOBm4KEht7vRj/D9WIYVU/eiqh6oqmfHh6cYvedhOxryfQHwMUafy/SLzRxukw3Zi/cCx6rq\naYCqenKTZ9wsQ/aigNeML78W+MkmzrdpqupBRn/xeCmHgM/XyCng6iSvn3a7Gx38i30sw+5Lramq\n54EXPpZhuxmyF5PuYPQTfDuauhfjX1H3VtXXN3OwLTDk++I64Lok30lyKsmBTZtucw3Zi48Ctye5\nAJwA3r85o112XmpPgE3+aAUNk+R2YB5461bPshWSvAL4BPDuLR7lcrGT0dM6tzD6re/BJL9fVf+1\npVNtjSPA56rqH5P8MaP3/7y5qv5nqwd7OdjoR/h+LMOKIXtBkrcDHwYOVtUvN2m2zTZtL64C3gx8\nO8mPGD1HubBNX7gd8n1xAVioql9V1Q+BHzD6AbDdDNmLO4D7Aarqu8CrGH2wWjeDerLaRgffj2VY\nMXUvktwAfJpR7Lfr87QwZS+q6pmq2lVV11TVNYxezzhYVWv+0KjL2JD/I19j9OieJLsYPcVzbjOH\n3CRD9uLHwNsAkryJUfCXN3XKy8MC8K7xX+vcDDxTVT+d9kUb+pRObdzHMrzsDNyLe4FXA18Zv279\n46o6uGVDb5CBe9HCwL04Cfx5krPAfwMfqqpt91vwwL34IPCZJH/L6AXcd2/HB4hJvsToh/yu8esV\nHwFeCVBVn2L0+sVtwBLwLPCeQbe7DfdKknQRvtNWkpow+JLUhMGXpCYMviQ1YfAlqQmDL0lNGHxJ\nauJ/Acz2XLpusNoKAAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10a4d8668>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plt.figure()\n",
"ax = fig.add_subplot(111)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 绘制散点图"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.collections.PathCollection at 0x10a5270b8>"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset = np.array(dataset)\n",
"ax.scatter(dataset[:, 0], dataset[:, 1])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 绘制回归曲线"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[<matplotlib.lines.Line2D at 0x10a518710>]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.sort(np.array(dataset[:, 0]))\n",
"y = [tree_predict([1.0] + [i], tree) for i in x]\n",
"ax.plot(x, y, c='r')"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD8CAYAAABn919SAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xd809XixvHPaZtCy2qFOiggyg8XoIJVUa4DFyoyZC/F\nieO6FcQrlCUyKlec140gXKQMK7jQizguCooWRESuojKCSEEKQkPpOL8/0oSkTWmhadqkz/v1QtLk\nfJPztfBweqax1iIiIuEvqqorICIiwaFAFxGJEAp0EZEIoUAXEYkQCnQRkQihQBcRiRAKdBGRCKFA\nFxGJEAp0EZEIERPKD2vUqJFt3rx5KD9SRCTsffPNNzustUlllQtpoDdv3pyVK1eG8iNFRMKeMWZj\necqpy0VEJEIo0EVEIoQCXUQkQijQRUQiRJmBbox5zRiz3Rjzvc9zacaYH40x3xlj3jLGJFRuNUVE\npCzlaaG/DlxZ7LmPgNbW2tOB/wGPBLleIiJymMqctmit/cwY07zYcx/6fLkc6BXcaomIhI8RGWuY\nvWIzBdYSbQz9z23KY93bhLwewZiHfhMwp7QXjTFDgCEAzZo1C8LHiYhUHyMy1jBz+Sbv1wXWMnP5\nJrbNW8QP9RtjmjVjaKeT6d42udLrUqFBUWPMo0A+MKu0Mtbal6y1KdbalKSkMhc6iYiEldkrNpd4\nrtvapfxrdirDP5mGM9vFIwvWkJHprPS6HHGgG2NuAK4BBlqdNC0iNVRBsfi7+esMnnpnCl83acU/\nOt0FgCuvgLTF6yu9LkfU5WKMuRIYBlxkrc0JbpVERMJHtDHuULeWYZ9N587l83jvpPO5v8tD5MbE\nesttzXZVel3KDHRjzGzgYqCRMWYLMAr3rJZawEfGGIDl1trbK7GeIiLVTkamk5hoAwfyefyDZ+m7\n5iNmnnkVqZffTmFUtF/ZxglxlV6f8sxy6R/g6VcroS4iImEhI9PJ6IVryXblUSsvlxcWTubyn1cw\ntUN/pnYYAO6Grp+hnU6u9HqFdLdFEZFwl5Hp5JEFa3DlFVB//15enj+Os7f8wIjL72Bmu84Br0mM\nd4RklosCXUTkMKQtXo8rr4Cj/9rJ9LmjaLFzC3d3Hca7p14QsHycI5pRXVqFpG4KdBGRcsjIdJK2\neD3ObBcn/OlkRnoqia493Nh7NMuan1nqdRN6tAlJ6xwU6CIiZfLtZmm97WdenzsKgP79HmfNcS1L\nvS45IS5kYQ7abVFEpEyebpYOv63izdmPsD+mFr0HTj5kmMc5okMyEOpLLXQRkTJszXbRed3nPPnO\nFDY0bMLg3mPYXq8h4G6Fe4I7bfF6tma7aFz0XChb56BAFxEp09/XfcgDC59hZZNTuaVnKntq1wXc\nYb5s+CXecqEO8OIU6CIipbEWxozhoYVPs+Sk9tx5zVByHbWAqulSKYsCXUSkiO9MFoctZNSH/2LQ\nqvfZ2K0vex+dRKMlG6q0S6UsCnQREfxnssTm5/HkO0/Qef0ynm/fi2daD2ZCTIxf90p1pFkuIiIc\nnMlSNzeH1+eOovP6ZYy75BYmX3QDrvzCkOyWWFFqoYuI4J7J0mjfLl6fO5qTs37jvmseJKNVR7/X\nqzsFuogIkFK4i7SZwzh635/c2mMkn7RI8Xs9FLslVpQCXURk9WremPYgrv0uBvYdT2byKX4vG0Kz\nW2JFqQ9dRGq2Tz+FCy+kdu1YZkx+g1UBwnxg+2bVbkZLIGqhi0hE80xFDDjd8K23oH9/OPFEWLyY\ne5s25fhDla/mTCiPA01JSbErV64M2eeJSM3mOxXRwxFlqFs7hk5fvsP4xc+xu/WZHLX0Q2jYsApr\nemjGmG+stSlllVOXi4hELM9URF95BYX0XzKLiR88w2cntOWyq0aSsWl/FdUwuNTlIiIRxbeLpXj/\ng7GFpC55mRu/WcSCVh0ZdtW95JsY0havD5tulUNRoItIxAjUxeLhKMjjiXen0m3dp7yS0o3xl9yM\nNe5OCmcYzDEvDwW6iESMQF0sAPEHXLzw1uNc+FsmEy+6gRfO7el3kHN0gEOdw5ECXUQiRqDVnIk5\nu5k2bwxttv3M0KvuYe7pV5QoUxDCySGVSYEuImHP029ePJaTd29nRnoqyXu2c9u1j/KflucGvD45\nDFaBlocCXUTCWmn95i2zNjIjPZU6efu5rs9Yvm7aOuD1jmgTFqtAy0OBLiJhLVC/ebst63ht/hhy\nY2LpPXAS65OaB7w2Md7BqC6tImKGCyjQRSTMFe8377jha57PmMjv9Rpyfd9xbGlwjN/r0cYwpc8Z\nERPivrSwSETCmu8uiD2+X8LL88fxU6OmDL5xCjsbNfYrG+eIjtgwB7XQRSSMBNqXpXnDOJzZLm5d\nsYBHP3mNz48/k9uv/QcuRzwDzkpm6Y9ZYbkvy5HQXi4iEhYCDX5GGSgstAz/ZBq3f7WARadcwIOd\nH+BAjANwTzV/ss+ZYR/i5d3LRS10EQkLgQY/o/LzmfzBM/T6fgnT23VmzKVDKIyK9r5uLTyyYA1A\n2Id6eagPXUTCQvHl+bXz9vPiW+Pp9f0S/vm3gYy67Ha/MPdw5RWExXmgwaAWuohUexmZTgx4Fw41\ncP3Fq/PH0s75I49ecSez2l59yOvD4TzQYFCgi0i157sK9Ng9O5g+N5Xmu7ZyZ/fhfHByhzKvD4fz\nQINBgS4i1U7x2Sye7pYWOzczPT2VBvv3ckPvsXx5/OllvlecIzpiVoKWRYEuItVK8dkszmwXBjh9\n63qmzRtDgYmi34CJrD2mhfcaA/w6sbP3+nA9Qq6iFOgiUq0Ems3yt1+/5YW3HmdHnQSu7zOWjYn+\nC4Z8u1S6t02uMQFeXJmzXIwxrxljthtjvvd57ihjzEfGmJ+Kfk+s3GqKSE1RfACz6w+f8uq8sWxM\nPI6bbn6yRJjXpC6VspRn2uLrwJXFnhsOLLHWtgSWFH0tInLYMjKddJj4MScMf5cOEz+mQZzD+9oN\nKxfy9KI0vk0+hfvumMqSJ/oxte+ZJCfEYXBvezuhR5sa2yIvrswuF2vtZ8aY5sWe7gZcXPR4OvAJ\n8HAQ6yUiNUCg/nKAKCz3fzaTu7+cw+KW7RnW8xHGdDsLqNldKmU50j70Y6y1vxc93gYcc6jCIiKB\nBOovjy4sYNyHzzNg9WJmn34F/+rzEGOuPk0hXg4VHhS11lpjTKkbwhhjhgBDAJo1a1bRjxORCFJ8\n9Wet/AM8vXAynX5azjPn9eXpi68jTWFebke69P8PY8xxAEW/by+toLX2JWttirU2JSkp6Qg/TkQi\nke/hzPVy9zE9PZVOPy1n1GW3MeXC68grpMYs2w+GIw30hcDgoseDgbeDUx0RqUk8hzMn7f2TOf8e\nTjvnj9zTZSjTz+riLVNTlu0HQ5ldLsaY2bgHQBsZY7YAo4CJQLox5mZgI9CnMispIuEv0IKf5IQ4\nYn7dwBtzRtIwZzc390rl8xPa+V1XU5b
tB0N5Zrn0L+WlS4NcFxGJUIFms9w/ZxWdDmxl3MxhRNtC\nBvQbz+rGJeeTa455+WmlqIhUutEL15aYzdJ+43ekLRjH7tp1ub7POH5p2KTEdYPaN9OA6GFQoItI\npcrIdJLtyvN77sr1y3hqURq/JTbm+j5j+aNeoxLXTe0b/icNhZoOuBCRSlV8lsqAVe/zfMZE1hzb\nkj4DJgUM8+SEOIX5EVALXUQqlXeWirXc88WbPPDfWSxpcTZ/7/Yw+x21S5TX3ixHToEuIkGVkelk\nzKK17Mo52M0SVVjAqCUvMfjbd5nX+lKGX3k3+dEH4yc5Ia5GbncbbAp0EQmajEwnQ+etJq/g4OLx\n2Pw8/vnuP7nmx8954ZweTLz4RvBZUJScEMey4ZdURXUjjgJdRIImbfF6vzCvk5vDi2+N528bVzP+\n4pt4+dwefuXVvRJcCnQRCRrfVZ0N92Uzbd5oTvvjFx7ofD8LWvsvXUlW90rQKdBF5IgVX/0ZHxvN\nvgMFNMnexoz0VI77aye39hzJ0hZnl7hW3SzBp0AXkSNS2l7mJ2f9xoz0VGrlH2Bg38f4tsmpJa5N\n8DnEQoJH89BF5IgEWv2ZsmUt6bMephBD7wGTAoa5I8owumurUFWzRlELXUQOS0amk9EL15ZY/XnZ\nTyt4duEknPWP5rq+Y9la/+gS16rfvHIp0EWk3Ip3s3j0/u4jJnzwDN8f24Ibe41mV3yDEtf+NrFz\nqKpZYynQRaTcShwZZy23r5jP8E9f57Pmbbn92n+QE1tyu9tkbYEbEgp0ESk332mJxhby6MevcsvK\nt3n71It4qPN9xMbVJs7iF/qaax46CnQRKbfGCXE4s13EFOQz+f2n6LF2KdPO6sLYS2+ldqyD8de2\nAShxkIX6zENDgS4iZRr48pcs2/AnAHEH9vOvjAlc/Os3TL7wep5v35vEOrGM6tLKG9wK8KqhQBeR\nQ/IN8wTXHqbNHcPp237i4Svv5r8XdWeqWuDVhgJdRPx4Vn86s11EG+M9yPm4PVnMSE+lWfY27uj+\nCB+edB4m2+Xd71yhXvUU6CLiVXxaoifMW+zYzBvpI6mbm8P1fcayopm7r9ziXiH6yII1gEK9qmml\nqIh4lZiWCLR1/si8WcNwFObTd+BEb5j7cuUVlDiZSEJPgS4iXr7TEgEu3rCSWXMeZXftuvQY9ATr\njj6x3NdK6CnQRcSrsc8CoG5rl/LygnH8clQTeg2azOaEY8t9rVQNBbqIAO7+85wD+QDc/HUGT70z\nha+btKJf/wnsqJPoLZcQ5yDOEe13rRYPVQ8aFBWRg4OhB/J5+NPp3LFiHu+ddD73d3mI3JhYb7k4\nR7R3p0QtHqp+FOgiQtri9RzIPcCkD56l75qPmHnmVaRefjv169SmUa2YgMGtAK9+FOgiNZhnzvmO\nrGxeWDiZy39ewdQO/ZnaYQAYw25XHqtGXVHV1ZRyUqCL1EC+e5rX37+XN+aPJWXLOkZcfgcz2x3c\n5jYhXicLhRMFukgNkpHpZMyitezKcR9OcfRfO5k+dxQtdm7h7q7DePfUC/zKF60rkjChQBepIYqv\nAj3hTycz0lNJdO3hxt6jWdb8zBLX7C52KpFUbwp0kRrCdxVo620/8/rcUQD07/c4a45rGfAazS0P\nL5qHLlJDOItWcnb4bRVvzn6E/TG16D1wcqlhrrnl4UctdJEaICPTiQGuXvc5T74zhQ0NmzC49xi2\n12sYsHxivMNvf3MJDwp0kRogbfF6Bn37DmM+epGVTU7llp6p7Kldt0S5ZC0SCmsKdJFIZy19Fr3C\nvV/M5qP/O5e7ug4j11GrRLHfJnYOcLGEEwW6SCQrKIC//517v5jNnDaX848r76IgKrpEsWQNfkYE\nBbpIhPGu/tyxmxcWP0nH7z/n+fa9mHzhYDCmRHkNfkaOCgW6MeZ+4BbcB5esAW601u4PRsVE5NA8\nwe3ZZ6XjKUm8s/p3sl151M3NYdqCxzh/03dMuHwIczr0hABzyqONYUKPNuozjxBHPG3RGJMM3AOk\nWGtbA9FAv2BVTERK51kk5Mx2eY+Bm7l8E9muPBrt28Wbsx/h7C1rufeaB3mxXVeMIeCWt1P6nKEw\njyAVnYceA8QZY2KAeGBrxaskImUJdFQcQNPsbcybOYwT/9zCrT1G8narjgBk5+QxoUcbkhPiMLj7\nzNUyjzxH3OVirXUaY54ANgEu4ENr7YdBq5mIlCrQcW+nbv+F6emjcBTkM7DveDKTT/G+1jghju5t\nkxXgEa4iXS6JQDfgBKAxUMcYMyhAuSHGmJXGmJVZWVlHXlMR8Sq+JP/cTWuYM2s4+VHR9Bo42S/M\nNehZc1Sky+Uy4FdrbZa1Ng9YAJxfvJC19iVrbYq1NiUpKakCHyciHs0bHgz0Tv/7ghnpqfxRryE9\nB6WxoVFT72uJ8Q51rdQgFZnlsglob4yJx93lcimwMii1EpGAMjKdPPrWGvYdcPef9129mMcXP8fq\n41pyU69RZMfV95ZNjHeQmarDKWqSivShrzDGzAO+BfKBTOClYFVMRPz5bX9rLX//Mp2hn7/B0hPP\n4s5uj+CKre0tG+eIZlSXVlVYW6kKFZqHbq0dBYwKUl1Earzic8t991XxzGwxtpDUJS9z4zeLWNCq\nI8Ouupf86IN/lTW3vOYyNoRHkqSkpNiVK9UrIxJI8QMoABxRhrq1Y7wnDDkK8pjy7pN0XfcZL5/d\nncc73oQ1B4fC4hzRCvMIZIz5xlqbUlY5Lf0XqSYCzS3PK7TeMI8/4OKFtx7nwt8ymXDxDbx4Tk+/\npfxxjiiFeQ2nQBepJgLNLfdIzNnNtHljaLPtZ4ZedQ9zTz842GmAge2b8Vj3NiGopVRnCnSRaqJx\nQpz3VCFfybu3MyM9leQ927nt2kf5T8tzva9N7XumWuTipSPoRKqJoZ1OLrHfSsusjcybOZSkfbu4\nrs9YvzBPLlr9KeKhFrpINeE7m8WZ7aLdlnW8Nn8MuTGx9B44ifVJzb1ltfpTAlELXaQa6d42mWXD\nL6HXttXMmjOCP+Pq03NQml+Ya2MtKY1a6CLVzfTpTJ6Zyg9Hn8jgnqPYWScB0JREKZsCXaQKFV9I\nNHLdO1w540m+OP5Mhg0YRWHteExOXolFRiKBKNBFqojvQiJjC7n+ree48qsFLDrlAh7s/AAHCh3E\n5RXypGaySDmpD12kCmRkOnkwfTWuvAJiCvJJe+8pbvtqAdPbdebeLg9xIMYBgCuvgLTF66u4thIu\n1EIXCbERGWuYtXwTFqidt5/n3p7EpRu+5p9/G8jT5/crcZDzoRYcifhSoIuEUEamk5nLNwHQwPUX\nr84fSzvnjzx6xZ3Mant1wGuKH2YhUhoFukgIeAY/PStBj92zg+lzU2m+ayt3dh/OByd3CHid5pvL\n4VCgi1QS3xA3gGdf0xY7NzM9PZUG+/dyQ++xfHn86X7Xecoma2aLHCYFukglKL4VrifMz9i6nmnz
\nxlBgoug3YCJrj2nhd51CXCpCgS5SCQJthXvBr9/ywluPs6NOAtf3GcvGxMbe1+rERrN27JWhrqZE\nGAW6SCUoPjOl6w+fMuXdf/JTo2YM7j2WrLqJ3tcc0Ybx12rrW6k4zUMXqQQJ8Q7v4xtWLuTpRWl8\nk3wqfQdM9Avz5IQ40nqdoS4WCQq10EUqgbXu/zz0+Rvc9WU6i1u2556uw8iNifWWiTaGZcMvqbpK\nSsRRoIsEWUamk7/27WfC4ufo/92HzD79CkZ0+jsFUf57nReE8DxfqRkU6CIVUHxzrY6nJLFoxa88\nnzGBTj8t55nz+jLlgkElVn+Cu7tFJJgU6CJHqPjURGe2i4WfruOl+eNov/l7Rl12G9PP6hLwWkeU\n0YIhCToFusgRKj41MWnvn0yfO4r/27GZe7oMZeFpFwW8LiHOweiurTQQKkGnQBc5Qr5TE4/ftZU3\n5oykYc5ubu6VyucntPMrm5wQpwFQqXSatihyhDybZrXa9jPzZg6j7gEXA/qN57/Fwlz7sUioKNBF\nDlNGppMOEz/Gme3i/I2reXP2I+TGOOg1cDL/O/40BrZvRnJCHAad/ymhpS4XkcPgOxB61Y//Zeo7\nT/BbYmOu7zOWmKZNmaB9WKQKKdBFDoNnIHRg5nuM+/BffJt8Cjf3TGVPXD2eVJhLFVOXi0g5eLtZ\nduVwz7LZjP/weZa2SGFQ33HsjquHBR0VJ1VOLXSRMni6WXJzDzBmyUsM/vZd5rW+lOFX3k1+9MG/\nQjoqTqqaAl2kDKMXrqXAtZ+n3/0n1/z4OS+c04OJF99YYvWnjoqTqqZAFylFRqaT0QvXkpe9m9fe\nGs/fNq5m/MU38fK5PUqU1dREqQ4U6CIBjMhYw6zlmzhqXzYz5o3mtD9+4YHO97Og9aUlykYbo6mJ\nUi0o0EV8ZGQ6GbNoLbty8miSvY0Z6akc99dObu05kqUtzi5RPs4RrTCXakOBLjWab4D7OmX7r0yf\nO4pa+QcY2Pcxvm1yaolrdf6nVDcKdKmxMjKdDJ23mrwC/33Jz978Pa/OH8c+R216D5jET0nH+72u\nVrlUVxUKdGNMAvAK0Br3weY3WWu/DEbFRCqD7/7lUcaUOGTisp9W8OzCSTjrH811fceytf7Rfq/X\niY1m/LUKc6meKtpCfwr4wFrbyxgTC8QHoU4ilaL4/uXFw7z3dx8y8YNnWXNsC27sNZpd8Q1KvEdC\nfKzCXKqtIw50Y0wD4ELgBgBr7QHgQHCqJRJ8xfcv97KW21fMZ/inr/NZ87bcfu0/yIkNPKdci4ek\nOqtIC/0EIAuYZow5A/gGuNdauy8oNRMJskBhbGwhj378KresfJu3T72IhzrfR160o9T30OIhqc4q\nEugxQDvgbmvtCmPMU8BwYKRvIWPMEGAIQLNmzSrwcSLl5+krd2a7iC7qK48u1mceU5DP5Pefosfa\npUw7qwtjL70Va0rf3kiLh6S6q8jmXFuALdbaFUVfz8Md8H6stS9Za1OstSlJSUkV+DiR8vH0lTuL\nWuSeEPcN87gD+3ll/jh6rF3KlIuuZ8ylQ0qEuSPakBDn0L7mEjaOuIVurd1mjNlsjDnZWrseuBT4\nIXhVEzkypfaVF2m4/y9eSR/N6dt+InPkZFpcO4DkYq15zTGXcFTRWS53A7OKZrj8AtxY8SqJVMyh\nBi6P25PFjPRUWu7dDgvm07Z7d9qCglsiQoUC3Vq7CkgJUl1EgqJxQpy3u8XX/+3YxIz0VOodyOHz\nZ2dyQffuVVA7kcqjAy4k4gztdDJxjmi/59o51zF31sPEFBbQZ8BEhmysS0ams4pqKFI5FOgScbq3\nTWZCjzYkF00xvHjDSma9OYLdtevSc1Aa644+EVdegU4YkoijQJeIdu3apby8YBwbGjah16DJbE44\n1vuaFglJpNHmXBJxPNMWB3wxn5Efv8IXzU5nSI8R7K3lvzOFFglJpFGgS0QYkbGG2Ss2u+eaW8vD\nn07njhXzeO+k87m/y0PkxsT6ldciIYlECnQJeyMy1jBz+SYAogsLePyDZ+m75iNmnnkVqZffTmHU\nwQFSg7tlrjnmEokU6BL2Zq/YDECtvFyeXTiZy39ewdQO/ZnaYYDfQc7JCXEsG35JVVVTpNIp0CXs\nFVhL/f17eWX+WFK2rGPE5Xcws11nvzLqYpGaQIEuYe+4vX8ybc5ITvzTyd1dh/HuqRd4X1MXi9Qk\nCnQJWxmZTt6c+R/mvjGMhP1/cUPv0XzR/Ezv64PaN+Ox7m2qsIYioaVAl7CUkelk5rPzeXH2SKwx\n9Os/ge+P/T8Aoo2h/7lNFeZS4yjQJSwtfe7fvD5zFNm163Fd33H8epS7O0UDn1KTaaWohJ/0dNKm\n/YPNDY6hx6A0b5iDVn9KzaZAl/Dy3HPQrx/rmp1C3wET2V6vod/LWv0pNZkCXcKDtfw45H646y4+\nanEOQwY+jqtOPb8impooNZ360KXa8pwLuu3PvUxa+iK9Vr7HnDaX848r76IgPwpHFCTGO8jOydPU\nRBEU6FJNeMJ7a7aLxglxdDwliTlfbyYqN5dnFj3B1f/7gufb92LyhYO9qz/zCi3xsTFkpl5RxbUX\nqR4U6FLlPLsjes4BdWa7mLl8E3Vzc3h5wTjO27SGsZfcymtndytxrQZBRQ5SoEvIFW+N78vNL3Go\nc6N9u3h97mhOzvqNe695kLdbdQz4XhoEFTlIgS4hFag1XlzT7G28MWckR+/7k1t6pvLpiWcFfC8N\ngor4U6BLSKUtXl+iNe7rtD9+YfrcVGIKChjYdzyZyaeUKKP9WUQCU6BLSB2qz/vcTWt4ef449taK\np1+/CWxo1LREmYQ4B6tGaRBUJBDNQ5eQKq3Pu9P/vmBGeip/1GtIz0FpAcPcEWUY3bVVZVdRJGwp\n0CWkOp6SVOK5vqsX83zGRNYecyK9B07i9/olyyTEOUjrfYa6WEQOQV0uElJLf8w6+IW1/P3LdIZ+\n/gZLTzyLO7s9giu2NgBRBqxVX7nI4VCgS0h5+tCNLSR1ycvc+M0iFrTqyLCr7iU/+uAfx/q11Vcu\ncrgU6BJSjRPi2L5zD1PefZKu6z7j5bO783jHm7DGv/dvtyuvimooEr4U6BJSwy9oQuL1/fnbL98y\n4eIbePGcnn4HOXtowZDI4dOgqIRERqaTziMW0Kz3NZz36yoeveY+Xjy3V8Aw14IhkSOjFrpUuhEZ\na/hk8Uqmp48keU8Wt137KP9pea5fGQNY3CcOaRBU5Mgo0CXofPdqaRDnIGnTz8xLH0l8Xi7X9RnL\n101bl7jGE+Y6Pk7kyCnQJagyMp0MnbuavEILwIk/fcdr88eQGxNL74GTWJ/UvNRrtXOiSMUo0CUo\nPK1y3822Om74muczJvJ7vYZc33ccWxocc8j30ECoSMUo0KXCRmSsYdbyTVif53quWcKk95/ih2NO\n5MZeo9lZJ+GQ76GBUJGK0ywXqZCMTGeJMB+yYj5T3nu
SL5udTv9+j5cIcwN0aHEUyQlxGNx95xN6\ntNFAqEgFqYUuFZK2eL03zI0tZPgnr3PbVwtYdMoFPNj5AQ7EOAAt5RcJBQW6VIhnIDOmIJ9JHzxN\nz+8/5vV21zDmsiHe1Z+OaENaL22sJVLZKhzoxphoYCXgtNZeU/EqSThpnBDHzqxdPPf2JC7d8DVT\n/jaQZ87v510wlBjvYFSXVgpzkRAIRgv9XmAdUD8I71WjZWQ6Gb1wLdlF+5hURhgWP8+zIt0fGZlO\nCnfsYNabo2i7dT2PXnEns9pe7VcmPjZGYS4SIhUaFDXGNAE6A68Epzo1l2f+drbPplS7cvK4b84q\nRmSsCdpnPLJgDc5sFxb3eZ6PLFhDRqbziN5r6usfM/31obT+42fu7D68RJiD5paLhFJFZ7lMBYYB\nhaUVMMYMMcasNMaszMrKKq1YjZe2eL13MU5xs5ZvOqLQDfQZxc/zdOUVkLZ4/WG/1/TX3mfWtAc5\n7q8sbug9lg9O7hCwnOaWi4TOEQe6MeYaYLu19ptDlbPWvmStTbHWpiQllTyJRtwO1ZK1cEShW97P\ncGa7OGH4u3SY+HG5/uH45I13ePXVB4ktyKPfgIl8efzpActpbrlIaFWkD70D0NUYczVQG6hvjJlp\nrR0UnKr643+gAAAJrklEQVTVDJ4+7cBt84NKC+Pi+6YYA9k5eX6PPX3ljRPi/FZy+vLtggFK7/f+\n8EPOuaU3WfEJXN9nLBsTGwcspsFQkdAz1pYVJeV4E2MuBh4qa5ZLSkqKXblyZYU/r7or78Cjp0+7\neDdIIAlxDurUivF7T8Bv35RD8exmWB6eTbIyMp2MWbSWXTnufv1+P/+Xx99+gh+Pasrg3mPJqpsY\n8Pqpfc9UkIsEkTHmG2ttSlnlNA89yIqH9KFavYH6tANxRBn2Hcj3Dpg6s13cN2fVIUPa2EISXX+R\ntG8XjfZl06jo96Sc7KKvs2mU436+Xm4O6adfzuQLB+OKrc3WbJd7kHbeavIK3J9ww8qFjF7yEiua\nteGhgWPIKowN+LnJCXEKc5EqEpQWenlVpxZ6MKfv+eow8eNSuzUSinWDlFbOV7wjilqOaHbl5BFV\nWMBRrj3eQPaGdVEwJ3kfZ3NUzm5ibMmx6gNRMeyok+D+Fe/+vVZ+Ht3WfcqmBsfw8FX38uOpZ7Hb\nlUehBazloc/f4K4v0/ngpPO4t8tQ4urVYd+BfG/YeziiDGm9tYBIJNjK20KvkYEeaDMp8O/3LSvw\nfXcXjDaGAmtJLmdI+4ouLOConN3uMN63yxvIJQM7m6Nce4gOENK50Q6y/EI6kR11Esiqk+gNbc/X\ne2rVCXhK0Dmbv2fS+09xwq7fmdX2KiZcdCMuRy0eW/wc/b/7kH+f0YkRV9xJYVQ0Bniy75l+3TEJ\ncQ5Gd1WfuUhliMhA9w3RKAOeruOywsQ3nBPiHd4QCiTOEU3Ps5KZ/43TrzskzhHt3UCqtH8QwN1X\nHV2QT8McTyhn+4T0wcD2hHWi6y+iAryTK6aWXys6y+fxjjqJRQHuDu6/YuMDhvThqp23n4eW/Zsb\nv8pgW92GbGjYhAt/y+SZ8/oy5YJB3s/QQRQioRVxgV6eAcTEeIffrA5PS7u8A48enha3L0dBHq1i\ncrmiEXy1fB1Jni4Ov8B2Pz7KtSfg++Y4apVoNe+ITzzYui4K7aw6ieyLjfML6ThH9GHdQ0WcuXU9\nae9NpeXOzYy67Damn9XF+5q6VURCL+IC/VB904F4WtTFD13wFZufR6OcXQfDuCiQD4b1wS6PhP17\nA77H3ti4otZzYom+6R11EsjyeT4n9sgW2XjO2fTdFiCQxHj3zoaH+gkE8PvppjSx+Xkc91eW37RE\nT1eLwlwktCJulkt5l5DXysv1zuR4b/wXXLC3WL+0zwBi/dx9Ad9jT2y8N4TXJx3Psjpn+PVNu1vR\n7tf3O2oH8zYD8vy04fmJ48H01SV+ggD3vilDO51c4icSR7ShTmwMu10Hf3op7R+6xHgHe3PzOYDD\nL8zVMhep/sIm0D2zQi75+SuO3bvTp1XtH9j1D+QEvH53rTreQF6XdAKfNy8aNCw+mBjfgFxHrRDf\nXekGtW/mF6Ld2yZz/5xVActuzXZ5y5ZnBk/x4I9zRDOqSysADXiKhKGwCXRPy3P84uc4bu9OAHbV\nrucN5LXHtPCb2ZFV92Bf9c74BO9BC5UpUN97cWUt8Ik2hkJrDxnEpU159Oyb4mnNH0pZwa/wFgk/\nYRPo3dsms3Ljnwzu9xi7Y+P4M74BedGVH9LFFQ9kAwxs34zHurcpcwDWM4Nm6Y9ZOLNdJd7LdybN\noQTqVjmSfVPKE/wiEj7CJtAzMp3uqYQNm4bk8xzRBix+y+p9A/lQrdpAe6scai774S5uOpxuFRGp\nOcJ+lkugPU7uK6WPGdxBXXyFo+d9iocvKDRFpOrVmFkuu115rBp1hd9zpc3gSPaZ4VHekFaAi0i4\nCJtAL2sg0Neh+pjVbywikaqiJxaFzNBOJxPniPZ7rrSBwO5tk5nQow3JCXEY3C3z8gw2ioiEs7Bp\noR/uQKBa4iJS04RNoINCWkTkUMKmy0VERA5NgS4iEiEU6CIiEUKBLiISIRToIiIRIqRL/40xWcDG\nCr5NI2BHEKoTLnS/kU33G9mCdb/HW2uTyioU0kAPBmPMyvLsaRApdL+RTfcb2UJ9v+pyERGJEAp0\nEZEIEY6B/lJVVyDEdL+RTfcb2UJ6v2HXhy4iIoGFYwtdREQCqLaBboy50hiz3hjzszFmeIDXaxlj\n5hS9vsIY0zz0tQyectzvA8aYH4wx3xljlhhjjq+KegZLWffrU66nMcYaY8J6ZkR57tcY06foe7zW\nGPPvUNcxmMrx57mZMWapMSaz6M/01VVRz2AwxrxmjNlujPm+lNeNMebpov8X3xlj2lVaZay11e4X\nEA1sAE4EYoHVwGnFytwJvFD0uB8wp6rrXcn32xGIL3p8R6Tfb1G5esBnwHIgparrXcnf35ZAJpBY\n9PXRVV3vSr7fl4A7ih6fBvxW1fWuwP1eCLQDvi/l9auB93GfKd8eWFFZdamuLfRzgJ+ttb9Yaw8A\nbwLdipXpBkwvejwPuNQYY0JYx2Aq836ttUuttTlFXy4HmoS4jsFUnu8vwDhgErA/lJWrBOW531uB\n56y1uwCstdtDXMdgKs/9WqB+0eMGwNYQ1i+orLWfAX8eokg3YIZ1Ww4kGGOOq4y6VNdATwY2+3y9\npei5gGWstfnAbqBhSGoXfOW5X1834/4XP1yVeb9FP5Y2tda+G8qKVZLyfH9PAk4yxiwzxiw3xlwZ\nstoFX3nudzQwyBizBXgPuDs0VasSh/v3+4iF1QEXAsaYQUAKcFFV16WyGGOigH8CN1RxVUIpBne3\ny8W4f/r6zBjTxlqbXaW1qjz9gdettVOMMecBbxhjWltrC6u6YuGsurbQnUBTn6+bFD0XsIwxJgb3\nj207Q1K74C
vP/WKMuQx4FOhqrc0NUd0qQ1n3Ww9oDXxijPkNd7/jwjAeGC3P93cLsNBam2et/RX4\nH+6AD0flud+bgXQAa+2XQG3c+55EonL9/Q6G6hroXwMtjTEnGGNicQ96LixWZiEwuOhxL+BjWzQC\nEYbKvF9jTFvgRdxhHs79q1DG/Vprd1trG1lrm1trm+MeM+hqrV1ZNdWtsPL8ec7A3TrHGNMIdxfM\nL6GsZBCV5343AZcCGGNOxR3oWSGtZegsBK4vmu3SHthtrf29Uj6pqkeIDzFyfDXuVsoG4NGi58bi\n/osN7j8Ac4Gfga+AE6u6zpV8v/8B/gBWFf1aWNV1rsz7LVb2E8J4lks5v78GdzfTD8AaoF9V17mS\n7/c0YBnuGTCrgCuqus4VuNfZwO9AHu6ftG4Gbgdu9/nePlf0/2JNZf5Z1kpREZEIUV27XERE5DAp\n0EVEIoQCXUQkQijQRUQihAJdRCRCKNBFRCKEAl1EJEIo0EVEIsT/A9s/vksORoTUAAAAAElFTkSu\nQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x10a4d8668>"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fig"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
================================================
FILE: classification_and_regression_trees/prune.py
================================================
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from regression_tree import *
def not_tree(tree):
    ''' Check whether the given node is NOT a subtree, i.e. it is a leaf value.
    '''
    return type(tree) is not dict
def collapse(tree):
    ''' Collapse a tree into a single value: the mean of all its leaf values.
    '''
    if not_tree(tree):
        return tree
    ltree, rtree = tree['left'], tree['right']
    return (collapse(ltree) + collapse(rtree))/2
def postprune(tree, test_data):
    ''' Post-prune the tree structure using the test data.
    '''
    if not_tree(tree):
        return tree
    # If no test data reaches this node, collapse the subtree into its mean value
    if not test_data:
        return collapse(tree)
    # Split the test data on the current node so that each subtree is pruned
    # against the test samples that actually reach it
    ldata, rdata = split_dataset(test_data, tree['feat_idx'], tree['feat_val'])
    ltree, rtree = tree['left'], tree['right']
    if not_tree(ltree) and not_tree(rtree):
        # Compare the test error on the target column before and after merging the two leaves
        err_no_merge = ((np.sum((np.array(ldata)[:, -1] - ltree)**2) if ldata else 0) +
                        (np.sum((np.array(rdata)[:, -1] - rtree)**2) if rdata else 0))
        err_merge = np.sum((np.array(test_data)[:, -1] - (ltree + rtree)/2)**2)
        if err_merge < err_no_merge:
            print('merged')
            return (ltree + rtree)/2
        else:
            return tree
    tree['left'] = postprune(tree['left'], ldata)
    tree['right'] = postprune(tree['right'], rdata)
    return tree
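# A minimal usage sketch (illustrative only, assuming the data files used in the notebooks,
# e.g. ex2.txt / ex2test.txt):
#     tree = create_tree(load_data('ex2.txt'), fleaf, ferr)
#     pruned_tree = postprune(tree, load_data('ex2test.txt'))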
================================================
FILE: classification_and_regression_trees/regression_tree.py
================================================
#!/usr/bin/env python
# -*- coding: utf-8 -*-
''' Regression tree implementation.
'''
import uuid
from functools import namedtuple
import numpy as np
import matplotlib.pyplot as plt
def load_data(filename):
    ''' Load data from a text file.
'''
dataset = []
with open(filename, 'r') as f:
for line in f:
line_data = [float(data) for data in line.split()]
dataset.append(line_data)
return dataset
def split_dataset(dataset, feat_idx, value):
    ''' Split the dataset according to the given feature index and feature value.
'''
ldata, rdata = [], []
for data in dataset:
if data[feat_idx] < value:
ldata.append(data)
else:
rdata.append(data)
return ldata, rdata
def create_tree(dataset, fleaf, ferr, opt=None):
    ''' Recursively create the tree structure.
    dataset: dataset to be split
    fleaf: function that creates a leaf node
    ferr: function that computes the error of a dataset
    opt: regression tree parameters.
        err_tolerance: minimum error reduction;
        n_tolerance: minimum number of samples for a split
    '''
    if opt is None:
        opt = {'err_tolerance': 1, 'n_tolerance': 4}
    # Choose the best splitting feature and feature value
    feat_idx, value = choose_best_feature(dataset, fleaf, ferr, opt)
    # Termination condition: no further split, a leaf value was returned
    if feat_idx is None:
        return value
    # Create the current tree node
    tree = {'feat_idx': feat_idx, 'feat_val': value}
    # Recursively create the left and right subtrees
ldata, rdata = split_dataset(dataset, feat_idx, value)
ltree = create_tree(ldata, fleaf, ferr, opt)
rtree = create_tree(rdata, fleaf, ferr, opt)
tree['left'] = ltree
tree['right'] = rtree
return tree
def fleaf(dataset):
    ''' Compute the leaf-node value for the given data, here the mean of the target column.
'''
dataset = np.array(dataset)
return np.mean(dataset[:, -1])
def ferr(dataset):
    ''' Compute the total squared error of the dataset:
    variance of the target column times the number of samples.
    '''
    dataset = np.array(dataset)
    m, _ = dataset.shape
    return np.var(dataset[:, -1])*m
def choose_best_feature(dataset, fleaf, ferr, opt):
    ''' Choose the best splitting feature and feature value.
    dataset: dataset to be split
    fleaf: function that creates a leaf node
    ferr: function that computes the error of a dataset
    opt: regression tree parameters.
        err_tolerance: minimum error reduction;
        n_tolerance: minimum number of samples for a split
    '''
    dataset = np.array(dataset)
    m, n = dataset.shape
    err_tolerance, n_tolerance = opt['err_tolerance'], opt['n_tolerance']
    err = ferr(dataset)
    best_feat_idx, best_feat_val, best_err = 0, 0, float('inf')
    # Iterate over all features
    for feat_idx in range(n-1):
        values = dataset[:, feat_idx]
        # Iterate over all values of the current feature
        for val in values:
            # Split the data on the current feature and value
            ldata, rdata = split_dataset(dataset.tolist(), feat_idx, val)
            if len(ldata) < n_tolerance or len(rdata) < n_tolerance:
                # Skip splits that leave too few samples on either side
                continue
            # Compute the error of the split
            new_err = ferr(ldata) + ferr(rdata)
            if new_err < best_err:
                best_feat_idx = feat_idx
                best_feat_val = val
                best_err = new_err
    # If the error reduction is too small, treat the data as a single leaf
    if abs(err - best_err) < err_tolerance:
        return None, fleaf(dataset)
    # Check whether the best split leaves too few samples on either side
    ldata, rdata = split_dataset(dataset.tolist(), best_feat_idx, best_feat_val)
    if len(ldata) < n_tolerance or len(rdata) < n_tolerance:
        return None, fleaf(dataset)
    return best_feat_idx, best_feat_val
def get_nodes_edges(tree, root_node=None):
    ''' Return all nodes and edges in the tree.
'''
Node = namedtuple('Node', ['id', 'label'])
Edge = namedtuple('Edge', ['start', 'end'])
nodes, edges = [], []
if type(tree) is not dict:
return nodes, edges
if root_node is None:
label = '{}: {}'.format(tree['feat_idx'], tree['feat_val'])
root_node = Node._make([uuid.uuid4(), label])
nodes.append(root_node)
for sub_tree in (tree['left'], tree['right']):
if type(sub_tree) is dict:
node_label = '{}: {}'.format(sub_tree['feat_idx'], sub_tree['feat_val'])
else:
node_label = '{:.2f}'.format(sub_tree)
sub_node = Node._make([uuid.uuid4(), node_label])
nodes.append(sub_node)
edge = Edge._make([root_node, sub_node])
edges.append(edge)
sub_nodes, sub_edges = get_nodes_edges(sub_tree, root_node=sub_node)
nodes.extend(sub_nodes)
edges.extend(sub_edges)
return nodes, edges
def dotify(tree):
    ''' Get the content of the Graphviz dot file for the tree.
'''
content = 'digraph decision_tree {\n'
nodes, edges = get_nodes_edges(tree)
for node in nodes:
content += ' "{}" [label="{}"];\n'.format(node.id, node.label)
for edge in edges:
start, end = edge.start, edge.end
content += ' "{}" -> "{}";\n'.format(start.id, end.id)
content += '}'
return content
def tree_predict(data, tree):
    ''' Predict the value of a data sample using the given regression tree.
'''
if type(tree) is not dict:
return tree
feat_idx, feat_val = tree['feat_idx'], tree['feat_val']
if data[feat_idx] < feat_val:
sub_tree = tree['left']
else:
sub_tree = tree['right']
return tree_predict(data, sub_tree)
if '__main__' == __name__:
datafile = 'ex0.txt'
dataset = load_data(datafile)
tree = create_tree(dataset, fleaf, ferr, opt={'n_tolerance': 4,
'err_tolerance': 1})
dotfile = '{}.dot'.format(datafile.split('.')[0])
with open(dotfile, 'w') as f:
content = dotify(tree)
f.write(content)
dataset = np.array(dataset)
    # Plot the scatter points
plt.scatter(dataset[:, 0], dataset[:, 1])
    # Plot the regression curve
x = np.linspace(0, 1, 50)
y = [tree_predict([i], tree) for i in x]
plt.plot(x, y, c='r')
plt.show()
================================================
FILE: decision_tree/english_big.txt
================================================
Urgent! call 09061749602 from Landline. Your complimentary 4* Tenerife Holiday or 10,000 cash await collection SAE T&Cs BOX 528 HP20 1YF 150ppm 18+,spam
+449071512431 URGENT! This is the 2nd attempt to contact U!U have WON 1250 CALL 09071512433 b4 050703 T&CsBCM4235WC1N3XX. callcost 150ppm mobilesvary. max7. 50,spam
FREE for 1st week! No1 Nokia tone 4 ur mob every week just txt NOKIA to 8007 Get txting and tell ur mates www.getzed.co.uk POBox 36504 W45WQ norm150p/tone 16+,spam
Urgent! call 09066612661 from landline. Your complementary 4* Tenerife Holiday or 10,000 cash await collection SAE T&Cs PO Box 3 WA14 2PX 150ppm 18+ Sender: Hol Offer,spam
WINNER!! As a valued network customer you have been selected to receivea 900 prize reward! To claim call 09061701461. Claim code KL341. Valid 12 hours only.,spam
okmail: Dear Dave this is your final notice to collect your 4* Tenerife Holiday or #5000 CASH award! Call 09061743806 from landline. TCs SAE Box326 CW25WX 150ppm,spam
07732584351 - Rodger Burns - MSG = We tried to call you re your reply to our sms for a free nokia mobile + free camcorder. Please call now 08000930705 for delivery tomorrow,spam
"URGENT! This is the 2nd attempt to contact U!U have WON 1000CALL 09071512432 b4 300603t&csBCM4235WC1N3XX.callcost150ppmmobilesvary. max7. 50",spam
Congrats! Nokia 3650 video camera phone is your Call 09066382422 Calls cost 150ppm Ave call 3mins vary from mobiles 16+ Close 300603 post BCM4284 Ldn WC1N3XX,spam
Urgent! Please call 0906346330. Your ABTA complimentary 4* Spanish Holiday or 10,000 cash await collection SAE T&Cs BOX 47 PO19 2EZ 150ppm 18+,spam
Congrats 2 mobile 3G Videophones R yours. call 09063458130 now! videochat wid ur mates, play java games, Dload polypH music, noline rentl. bx420. ip4. 5we. 150p,spam
Dear 0776xxxxxxx U've been invited to XCHAT. This is our final attempt to contact u! Txt CHAT to 86688 150p/MsgrcvdHG/Suite342/2Lands/Row/W1J6HL LDN 18yrs,spam
Win the newest Harry Potter and the Order of the Phoenix (Book 5) reply HARRY, answer 5 questions - chance to be the first among readers!,spam
SMS AUCTION - A BRAND NEW Nokia 7250 is up 4 auction today! Auction is FREE 2 join & take part! Txt NOKIA to 86021 now!,spam
09066362231 URGENT! Your mobile No 07xxxxxxxxx won a 2,000 bonus caller prize on 02/06/03! this is the 2nd attempt to reach YOU! call 09066362231 ASAP!,spam
Dear U've been invited to XCHAT. This is our final attempt to contact u! Txt CHAT to 86688,spam
449050000301 You have won a 2,000 price! To claim, call 09050000301.,spam
YOU ARE CHOSEN TO RECEIVE A 350 AWARD! Pls call claim number 09066364311 to collect your award which you are selected to receive as a valued mobile customer.,spam
44 7732584351, Do you want a New Nokia 3510i colour phone DeliveredTomorrow? With 300 free minutes to any mobile + 100 free texts + Free Camcorder reply or call 08000930705.,spam
URGENT! Your mobile was awarded a 1,500 Bonus Caller Prize on 27/6/03. Our final attempt 2 contact U! Call 08714714011,spam
Congrats! 2 mobile 3G Videophones R yours. call 09063458130 now! videochat wid your mates, play java games, Dload polyPH music, noline rentl.,spam
Wan2 win a Meet+Greet with Westlife 4 U or a m8? They are currently on what tour? 1)Unbreakable, 2)Untamed, 3)Unkempt. Text 1,2 or 3 to 83049. Cost 50p +std text,spam
URGENT This is our 2nd attempt to contact U. Your 900 prize from YESTERDAY is still awaiting collection. To claim CALL NOW 09061702893,spam
Want explicit SEX in 30 secs? Ring 02073162414 now! Costs 20p/min,spam
Sorry I missed your call let's talk when you have the time. I'm on 07090201529,spam
Congratulations YOU'VE Won. You're a Winner in our August 1000 Prize Draw. Call 09066660100 NOW. Prize Code 2309.,spam
Fantasy Football is back on your TV. Go to Sky Gamestar on Sky Active and play 250k Dream Team. Scoring starts on Saturday, so register now!SKY OPT OUT to 88088,spam
87077: Kick off a new season with 2wks FREE goals & news to ur mobile! Txt ur club name to 87077 eg VILLA to 87077,spam
This is the 2nd attempt to contract U, you have won this weeks top prize of either 1000 cash or 200 prize. Just call 09066361921,spam
You have won ?1,000 cash or a ?2,000 prize! To claim, call09050000327,spam
Talk sexy!! Make new friends or fall in love in the worlds most discreet text dating service. Just text VIP to 83110 and see who you could meet.,spam
Todays Vodafone numbers ending with 4882 are selected to a receive a 350 award. If your number matches call 09064019014 to receive your 350 award.,spam
GENT! We are trying to contact you. Last weekends draw shows that you won a 1000 prize GUARANTEED. Call 09064012160. Claim Code K52. Valid 12hrs only. 150ppm ,spam
Camera - You are awarded a SiPix Digital Camera! call 09061221066 fromm landline. Delivery within 28 days.,spam
YOU VE WON! Your 4* Costa Del Sol Holiday or 5000 await collection. Call 09050090044 Now toClaim. SAE, TC s, POBox334, Stockport, SK38xh, Cost1.50/pm, Max10mins,spam
WELL DONE! Your 4* Costa Del Sol Holiday or 5000 await collection. Call 09050090044 Now toClaim. SAE, TCs, POBox334, Stockport, SK38xh, Cost1.50/pm, Max10mins,spam
Camera - You are awarded a SiPix Digital Camera! call 09061221066 fromm landline. Delivery within 28 days,spam
Congratulations ur awarded 500 of CD vouchers or 125gift guaranteed & Free entry 2 100 wkly draw txt MUSIC to 87066,spam
Loan for any purpose 500 - 75,000. Homeowners + Tenants welcome. Have you been previously refused? We can still help. Call Free 0800 1956669 or text back 'help',spam
This is the 2nd time we have tried 2 contact u. U have won the 750 Pound prize. 2 claim is easy, call 087187272008 NOW1! Only 10p per minute. BT-national-rate.,spam
Congrats! 1 year special cinema pass for 2 is yours. call 09061209465 now! C Suprman V, Matrix3, StarWars3, etc all 4 FREE! bx420-ip4-5we. 150pm. Dont miss out!,spam
Message Important information for O2 user. Today is your lucky day! 2 find out why log onto http://www.urawinner.com there is a fantastic surprise awaiting you,spam
Had your mobile 11 months or more? U R entitled to Update to the latest colour mobiles with camera for Free! Call The Mobile Update Co FREE on 08002986030,spam
Bloomberg -Message center +447797706009 Why wait? Apply for your future http://careers. bloomberg.com,spam
Sppok up ur mob with a Halloween collection of nokia logo&pic message plus a FREE eerie tone, txt CARD SPOOK to 8007,spam
25p 4 alfie Moon's Children in need song on ur mob. Tell ur m8s. Txt Tone charity to 8007 for Nokias or Poly charity for polys: zed 08701417012 profit 2 charity.,spam
URGENT!: Your Mobile No. was awarded a 2,000 Bonus Caller Prize on 02/09/03! This is our 2nd attempt to contact YOU! Call 0871-872-9755 BOX95QU,spam
Phony 350 award - Todays Voda numbers ending XXXX are selected to receive a 350 award. If you have a match please call 08712300220 quoting claim code 3100 standard rates app,spam
we tried to contact you re your response to our offer of a new nokia fone and camcorder hit reply or call 08000930705 for delivery,spam
Hello from Orange. For 1 month's free access to games, news and sport, plus 10 free texts and 20 photo messages, reply YES. Terms apply: www.orange.co.uk/ow,spam
Ur HMV Quiz cash-balance is currently 500 - to maximize ur cash-in now send HMV1 to 86688 only 150p/msg,spam
YOU HAVE WON! As a valued Vodafone customer our computer has picked YOU to win a 150 prize. To collect is easy. Just call 09061743386,spam
Congratulations ur awarded either a yrs supply of CDs from Virgin Records or a Mystery Gift GUARANTEED Call 09061104283 Ts&Cs www.smsco.net 1.50pm approx 3mins,spam
A 400 XMAS REWARD IS WAITING FOR YOU! Our computer has randomly picked you from our loyal mobile customers to receive a 400 reward. Just call 09066380611 ,spam
December only! Had your mobile 11mths+? You are entitled to update to the latest colour camera mobile for Free! Call The Mobile Update Co FREE on 08002986906,spam
74355 XMAS iscoming & ur awarded either 500 CD gift vouchers & free entry 2 r 100 weekly draw txt MUSIC to 87066 TnC,spam
SIX chances to win CASH! From 100 to 20,000 pounds txt> CSH11 and send to 87575. Cost 150p/day, 6days, 16+ TsandCs apply Reply HL 4 info,spam
Todays Voda numbers ending 7548 are selected to receive a $350 award. If you have a match please call 08712300220 quoting claim code 4041 standard rates app,spam
Congratulations! Thanks to a good friend U have WON the 2,000 Xmas prize. 2 claim is easy, just call 08718726978 NOW! Only 10p per minute. BT-national-rate,spam
You have WON a guaranteed 1000 cash or a 2000 prize. To claim yr prize call our customer service representative on 08714712379 between 10am-7pm Cost 10p,spam
You are a winner you have been specially selected to receive 1000 cash or a 2000 award. Speak to a live operator to claim call 087147123779am-7pm. Cost 10p,spam
INTERFLORA - It's not too late to order Interflora flowers for christmas call 0800 505060 to place your order before Midnight tomorrow.,spam
8007 FREE for 1st week! No1 Nokia tone 4 ur mob every week just txt NOKIA to 8007 Get txting and tell ur mates www.getzed.co.uk POBox 36504 W4 5WQ norm 150p/tone 16+,spam
Congratulations ur awarded either 500 of CD gift vouchers & Free entry 2 our 100 weekly draw txt MUSIC to 87066 TnCs www.Ldew.com 1 win150ppmx3age16,spam
"For the most sparkling shopping breaks from 45 per person; call 0121 2025050 or visit www.shortbreaks.org.uk",spam
Are you unique enough? Find out from 30th August. www.areyouunique.co.uk,spam
WINNER! As a valued network customer you hvae been selected to receive a 900 reward! To collect call 09061701444. Valid 24 hours only. ACL03530150PM,spam
Congratulations U can claim 2 VIP row A Tickets 2 C Blu in concert in November or Blu gift guaranteed Call 09061104276 to claim TS&Cs www.smsco.net cost3.75max ,spam
This is the 2nd time we have tried to contact u. U have won the 1450 prize to claim just call 09053750005 b4 310303. T&Cs/stop SMS 08718725756. 140ppm,spam
Urgent Ur 500 guaranteed award is still unclaimed! Call 09066368327 NOW closingdate04/09/02 claimcode M39M51 1.50pmmorefrommobile2Bremoved-MobyPOBox734LS27YF,spam
If you don't, your prize will go to another customer. T&C at www.t-c.biz 18+ 150p/min Polo Ltd Suite 373 London W1J 6HL Please call back if busy,spam
No 1 POLYPHONIC tone 4 ur mob every week! Just txt PT2 to 87575. 1st Tone FREE ! so get txtin now and tell ur friends. 150p/tone. 16 reply HL 4info,spam
I don't know u and u don't know me. Send CHAT to 86688 now and let's find each other! Only 150p/Msg rcvd. HG/Suite342/2Lands/Row/W1J6HL LDN. 18 years or over.,spam
Send a logo 2 ur lover - 2 names joined by a heart. Txt LOVE NAME1 NAME2 MOBNO eg LOVE ADAM EVE 07123456789 to 87077 Yahoo! POBox36504W45WQ TxtNO 4 no ads 150p,spam
HMV BONUS SPECIAL 500 pounds of genuine HMV vouchers to be won. Just answer 4 easy questions. Play Now! Send HMV to 86688 More info:www.100percent-real.com,spam
Please call our customer service representative on 0800 169 6031 between 10am-9pm as you have WON a guaranteed 1000 cash or 5000 prize!,spam
You are being contacted by our dating service by someone you know! To find out who it is, call from a land line 09050000878. PoBox45W2TG150P,spam
83039 62735=450 UK Break AccommodationVouchers terms & conditions apply. 2 claim you mustprovide your claim number which is 15541 ,spam
You have an important customer service announcement from PREMIER. Call FREEPHONE 0800 542 0578 now!,spam
You are awarded a SiPix Digital Camera! call 09061221061 from landline. Delivery within 28days. T Cs Box177. M221BP. 2yr warranty. 150ppm. 16 . p p3.99,spam
Please call our customer service representative on FREEPHONE 0808 145 4742 between 9am-11pm as you have WON a guaranteed 1000 cash or 5000 prize!,spam
You are a winner U have been specially selected 2 receive 1000 cash or a 4* holiday (flights inc) speak to a live operator 2 claim 0871277810810,spam
"Hey sorry I didntgive ya a a bellearlier hunny,just been in bedbut mite go 2 thepub l8tr if uwana mt up?loads a luv Jenxxx.",ham
"Are you comingdown later?",ham
"HEY HEY WERETHE MONKEESPEOPLE SAY WE MONKEYAROUND! HOWDY GORGEOUS, HOWU DOIN? FOUNDURSELF A JOBYET SAUSAGE?LOVE JEN XXX",ham
"CHA QUITEAMUZING THATSCOOL BABE,PROBPOP IN & CU SATTHEN HUNNY 4BREKKIE! LOVE JEN XXX. PSXTRA LRG PORTIONS 4 ME PLEASE ",ham
"HEY BABE! FAR 2 SPUN-OUT 2 SPK AT DA MO... DEAD 2 DA WRLD. BEEN SLEEPING ON DA SOFA ALL DAY, HAD A COOL NYTHO, TX 4 FONIN HON, CALL 2MWEN IM BK FRMCLOUD 9! J X",ham
"CHEERS U TEX MECAUSE U WEREBORED! YEAH OKDEN HUNNY R UIN WK SAT?SOUNDS LIKEYOUR HAVIN GR8FUN J! KEEP UPDAT COUNTINLOTS OF LOVEME XXXXX.",ham
"EY! CALM DOWNON THEACUSATIONS.. ITXT U COS IWANA KNOW WOTU R DOIN AT THEW/END... HAVENTCN U IN AGES..RING ME IF UR UP4 NETHING SAT.LOVE J XXX.",ham
"YEH I AM DEF UP4 SOMETHING SAT,JUST GOT PAYED2DAY & I HAVBEEN GIVEN A50 PAY RISE 4MY WORK & HAVEBEEN MADE PRESCHOOLCO-ORDINATOR 2I AM FEELINGOOD LUV",ham
"Hi its Kate it was lovely to see you tonight and ill phone you tomorrow. I got to sing and a guy gave me his card! xxx",ham
"Thinking of u ;) x",ham
Me too! Have a lovely night xxx,ham
Hey hun-onbus goin 2 meet him. He wants 2go out 4a meal but I donyt feel like it cuz have 2 get last bus home!But hes sweet latelyxxx,ham
Hi mate its RV did u hav a nice hol just a message 3 say hello coz havent sent u 1 in ages started driving so stay off roads!RVx,ham
IM FINE BABES AINT BEEN UP 2 MUCH THO! SAW SCARY MOVIE YEST ITS QUITE FUNNY! WANT 2MRW AFTERNOON? AT TOWN OR MALL OR SUMTHIN?xx,ham
I notice you like looking in the shit mirror youre turning into a right freak,ham
IM LATE TELLMISS IM ON MY WAY,ham
Been up to ne thing interesting. Did you have a good birthday? When are u wrking nxt? I started uni today.,ham
IM GONNAMISSU SO MUCH!!I WOULD SAY IL SEND U A POSTCARD BUTTHERES ABOUTAS MUCH CHANCE OF MEREMEMBERIN ASTHERE IS OFSI NOT BREAKIN HIS CONTRACT!! LUV Yaxx,ham
Thanx 4 the time weve spent 2geva, its bin mint! Ur my Baby and all I want is u!xxxx,ham
You stayin out of trouble stranger!!saw Dave the other day hes sorted now!still with me bloke when u gona get a girl MR!ur mum still Thinks we will get 2GETHA! ,ham
THANX 4 PUTTIN DA FONE DOWN ON ME!!,ham
I know dat feelin had it with Pete! Wuld get with em , nuther place nuther time mayb?,ham
U 2.,ham
Thanx u darlin!im cool thanx. A few bday drinks 2 nite. 2morrow off! Take care c u soon.xxx,ham
HIYA COMIN 2 BRISTOL 1 ST WEEK IN APRIL. LES GOT OFF + RUDI ON NEW YRS EVE BUT I WAS SNORING.THEY WERE DRUNK! U BAK AT COLLEGE YET? MY WORK SENDS INK 2 BATH.,ham
Sez, hows u & de arab boy? Hope u r all good give my love 2 evry1 love ya eshxxxxxxxxxxx,ham
THING R GOOD THANX GOT EXAMS IN MARCH IVE DONE NO REVISION? IS FRAN STILL WITH BOYF? IVE GOTTA INTERVIW 4 EXETER BIT WORRIED!x,ham
I love u 2 babe! R u sure everything is alrite. Is he being an idiot? Txt bak girlie,ham
I luv u soo much u dont understand how special u r 2 me ring u 2morrow luv u xxx,ham
NOT MUCH NO FIGHTS. IT WAS A GOOD NITE!!,ham
================================================
FILE: decision_tree/lenses.dot
================================================
digraph decision_tree {
"99d3b650-7557-420c-be5f-037403909eef" [label="tearRate"];
"ccf5c62e-14ca-4cef-9525-4b8f026622dc" [label="no lenses"];
"6a72f3f9-51ce-4433-b052-34765c65a61e" [label="astigmatic"];
"91ea78df-9cfd-4334-a592-1c8b3c193f0d" [label="age"];
"b5d2e2b7-241b-4c46-a56b-61ba9a1e7678" [label="soft"];
"62193a33-c49d-4bce-b820-1613685e09ce" [label="soft"];
"01240d64-7b96-40fc-9a4b-185cc0fca9d6" [label="prescript"];
"5571119a-43b5-414e-9bf5-c9c62a9dee8c" [label="soft"];
"087246f9-495f-44ef-8ea0-5043b238c1c1" [label="no lenses"];
"c0b04ca3-692d-4498-8292-165ed4997ce5" [label="prescript"];
"8f6cfe1f-a0ea-46de-a456-f3f8b35bca8d" [label="age"];
"4d2b5c7f-e85e-4d44-8da9-0d88de048430" [label="hard"];
"cdc375a5-561f-48c9-a847-ccc73f1cc44c" [label="no lenses"];
"4600fda0-b8a8-45cc-8174-d554de9b7e84" [label="no lenses"];
"08a19fa5-952c-4ab3-a283-dbfe2e3e5870" [label="hard"];
"99d3b650-7557-420c-be5f-037403909eef" -> "ccf5c62e-14ca-4cef-9525-4b8f026622dc" [label="reduced"];
"99d3b650-7557-420c-be5f-037403909eef" -> "6a72f3f9-51ce-4433-b052-34765c65a61e" [label="normal"];
"6a72f3f9-51ce-4433-b052-34765c65a61e" -> "91ea78df-9cfd-4334-a592-1c8b3c193f0d" [label="no"];
"91ea78df-9cfd-4334-a592-1c8b3c193f0d" -> "b5d2e2b7-241b-4c46-a56b-61ba9a1e7678" [label="young"];
"91ea78df-9cfd-4334-a592-1c8b3c193f0d" -> "62193a33-c49d-4bce-b820-1613685e09ce" [label="pre"];
"91ea78df-9cfd-4334-a592-1c8b3c193f0d" -> "01240d64-7b96-40fc-9a4b-185cc0fca9d6" [label="presbyopic"];
"01240d64-7b96-40fc-9a4b-185cc0fca9d6" -> "5571119a-43b5-414e-9bf5-c9c62a9dee8c" [label="hyper"];
"01240d64-7b96-40fc-9a4b-185cc0fca9d6" -> "087246f9-495f-44ef-8ea0-5043b238c1c1" [label="myope"];
"6a72f3f9-51ce-4433-b052-34765c65a61e" -> "c0b04ca3-692d-4498-8292-165ed4997ce5" [label="yes"];
"c0b04ca3-692d-4498-8292-165ed4997ce5" -> "8f6cfe1f-a0ea-46de-a456-f3f8b35bca8d" [label="hyper"];
"8f6cfe1f-a0ea-46de-a456-f3f8b35bca8d" -> "4d2b5c7f-e85e-4d44-8da9-0d88de048430" [label="young"];
"8f6cfe1f-a0ea-46de-a456-f3f8b35bca8d" -> "cdc375a5-561f-48c9-a847-ccc73f1cc44c" [label="pre"];
"8f6cfe1f-a0ea-46de-a456-f3f8b35bca8d" -> "4600fda0-b8a8-45cc-8174-d554de9b7e84" [label="presbyopic"];
"c0b04ca3-692d-4498-8292-165ed4997ce5" -> "08a19fa5-952c-4ab3-a283-dbfe2e3e5870" [label="myope"];
}
================================================
FILE: decision_tree/lenses.py
================================================
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from trees import DecisionTreeClassifier
lense_labels = ['age', 'prescript', 'astigmatic', 'tearRate']
X = []
Y = []
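# Each line of lenses.txt holds four tab-separated feature values followed by the class label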
with open('lenses.txt', 'r') as f:
for line in f:
comps = line.strip().split('\t')
X.append(comps[: -1])
Y.append(comps[-1])
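# Build an ID3 decision tree from the lenses data using the feature names above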
clf = DecisionTreeClassifier()
clf.create_tree(X, Y, lense_labels)
================================================
FILE: decision_tree/lenses.txt
================================================
young myope no reduced no lenses
young myope no normal soft
young myope yes reduced no lenses
young myope yes normal hard
young hyper no reduced no lenses
young hyper no normal soft
young hyper yes reduced no lenses
young hyper yes normal hard
pre myope no reduced no lenses
pre myope no normal soft
pre myope yes reduced no lenses
pre myope yes normal hard
pre hyper no reduced no lenses
pre hyper no normal soft
pre hyper yes reduced no lenses
pre hyper yes normal no lenses
presbyopic myope no reduced no lenses
presbyopic myope no normal no lenses
presbyopic myope yes reduced no lenses
presbyopic myope yes normal hard
presbyopic hyper no reduced no lenses
presbyopic hyper no normal soft
presbyopic hyper yes reduced no lenses
presbyopic hyper yes normal no lenses
================================================
FILE: decision_tree/sms_tree.dot
================================================
digraph decision_tree {
"959b4c0c-1821-446d-94a1-c619c2decfcd" [label="call"];
"18665160-b058-437f-9b2e-05df2eb55661" [label="to"];
"2eb9860d-d241-45ca-85e6-cbd80fe2ebf7" [label="your"];
"bcbcc17c-9e2a-4bd4-a039-6e51fde5f8fd" [label="areyouunique"];
"ca091fc7-8a4e-4970-9ec3-485a4628ad29" [label="02073162414"];
"aac20872-1aac-499d-b2b5-caf0ef56eff3" [label="ham"];
"18aa8685-a6e8-4d76-bad5-ccea922bb14d" [label="spam"];
"3f7f30b1-4dbb-4459-9f25-358ad3c6d50b" [label="spam"];
"44d1f972-cd97-4636-b6e6-a389bf560656" [label="spam"];
"7f3c8562-69b5-47a9-8ee4-898bd4b6b506" [label="i"];
"a6f22325-8841-4a81-bc04-4e7485117aa1" [label="spam"];
"c181fe42-fd3c-48db-968a-502f8dd462a4" [label="ldn"];
"51b9477a-0326-4774-8622-24d1d869a283" [label="ham"];
"16f6aecd-c675-4291-867c-6c64d27eb3fc" [label="spam"];
"adb05303-813a-4fe0-bf98-c319eb70be48" [label="spam"];
"959b4c0c-1821-446d-94a1-c619c2decfcd" -> "18665160-b058-437f-9b2e-05df2eb55661" [label="0"];
"18665160-b058-437f-9b2e-05df2eb55661" -> "2eb9860d-d241-45ca-85e6-cbd80fe2ebf7" [label="0"];
"2eb9860d-d241-45ca-85e6-cbd80fe2ebf7" -> "bcbcc17c-9e2a-4bd4-a039-6e51fde5f8fd" [label="0"];
"bcbcc17c-9e2a-4bd4-a039-6e51fde5f8fd" -> "ca091fc7-8a4e-4970-9ec3-485a4628ad29" [label="0"];
"ca091fc7-8a4e-4970-9ec3-485a4628ad29" -> "aac20872-1aac-499d-b2b5-caf0ef56eff3" [label="0"];
"ca091fc7-8a4e-4970-9ec3-485a4628ad29" -> "18aa8685-a6e8-4d76-bad5-ccea922bb14d" [label="1"];
"bcbcc17c-9e2a-4bd4-a039-6e51fde5f8fd" -> "3f7f30b1-4dbb-4459-9f25-358ad3c6d50b" [label="1"];
"2eb9860d-d241-45ca-85e6-cbd80fe2ebf7" -> "44d1f972-cd97-4636-b6e6-a389bf560656" [label="1"];
"18665160-b058-437f-9b2e-05df2eb55661" -> "7f3c8562-69b5-47a9-8ee4-898bd4b6b506" [label="1"];
"7f3c8562-69b5-47a9-8ee4-898bd4b6b506" -> "a6f22325-8841-4a81-bc04-4e7485117aa1" [label="0"];
"7f3c8562-69b5-47a9-8ee4-898bd4b6b506" -> "c181fe42-fd3c-48db-968a-502f8dd462a4" [label="1"];
"c181fe42-fd3c-48db-968a-502f8dd462a4" -> "51b9477a-0326-4774-8622-24d1d869a283" [label="0"];
"c181fe42-fd3c-48db-968a-502f8dd462a4" -> "16f6aecd-c675-4291-867c-6c64d27eb3fc" [label="1"];
"959b4c0c-1821-446d-94a1-c619c2decfcd" -> "adb05303-813a-4fe0-bf98-c319eb70be48" [label="1"];
}
================================================
FILE: decision_tree/sms_tree.py
================================================
#!/usr/bin/env python
# -*- coding: utf-8 -*-
''' Train a decision tree classifier on the spam SMS data and evaluate it with hold-out validation.
'''
import re
import random
import os
import numpy as np
import matplotlib.pyplot as plt
from trees import DecisionTreeClassifier
ENCODING = 'ISO-8859-1'
TRAIN_PERCENTAGE = 0.9
def get_doc_vector(words, vocabulary):
    ''' Convert the words of a document into a document vector according to the vocabulary.
    :param words: list of words in the document
    :type words: list of str
    :param vocabulary: the complete vocabulary list
    :type vocabulary: list of str
    :return doc_vect: document vector used for classification
    :type doc_vect: list of int
'''
doc_vect = [0]*len(vocabulary)
for word in words:
if word in vocabulary:
idx = vocabulary.index(word)
doc_vect[idx] = 1
return doc_vect
def parse_line(line):
    ''' Parse one line of the dataset and return the word list and the SMS class.
'''
cls = line.split(',')[-1].strip()
content = ','.join(line.split(',')[: -1])
word_vect = [word.lower() for word in re.split(r'\W+', content) if word]
return word_vect, cls
def parse_file(filename):
    ''' Parse the data in the given file.
'''
vocabulary, word_vects, classes = [], [], []
with open(filename, 'r', encoding=ENCODING) as f:
for line in f:
if line:
word_vect, cls = parse_line(line)
vocabulary.extend(word_vect)
word_vects.append(word_vect)
classes.append(cls)
vocabulary = list(set(vocabulary))
return vocabulary, word_vects, classes
if '__main__' == __name__:
clf = DecisionTreeClassifier()
vocabulary, word_vects, classes = parse_file('english_big.txt')
# 训练数据 & 测试数据
ntest = int(len(classes)*(1-TRAIN_PERCENTAGE))
test_word_vects = []
test_classes = []
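    # Randomly move ntest samples out of the training lists to form the hold-out test set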
for i in range(ntest):
idx = random.randint(0, len(word_vects)-1)
test_word_vects.append(word_vects.pop(idx))
test_classes.append(classes.pop(idx))
train_word_vects = word_vects
train_classes = classes
train_dataset = [get_doc_vector(words, vocabulary) for words in train_word_vects]
    # Build the decision tree (or load a cached one from disk)
if not os.path.exists('sms_tree.pkl'):
clf.create_tree(train_dataset, train_classes, vocabulary)
clf.dump_tree('sms_tree.pkl')
else:
clf.load_tree('sms_tree.pkl')
    # Evaluate the model on the hold-out test set
error = 0
for test_word_vect, test_cls in zip(test_word_vects, test_classes):
test_data = get_doc_vector(test_word_vect, vocabulary)
pred_cls = clf.classify(test_data, feat_names=vocabulary)
if test_cls != pred_cls:
print('Predict: {} -- Actual: {}'.format(pred_cls, test_cls))
error += 1
print('Error Rate: {}'.format(error/len(test_classes)))
================================================
FILE: decision_tree/sms_tree_2.dot
================================================
digraph decision_tree {
"8fbb40df-9b8c-4525-a34a-0ec254360649" [label="call"];
"9c2cf1a6-e34b-4f3c-9cc0-17a4e20f12f7" [label="to"];
"ef9e9738-2596-4bdf-a42d-28d7f5471ca7" [label="your"];
"626c0a8b-c1fe-42d9-ad4f-b79e03cf82f7" [label="from"];
"56bdce7c-b23c-4d52-a802-1c28377ec7f5" [label="explicit"];
"ebe24cea-1310-40fe-b164-25032c942aec" [label="ham"];
"1a56632b-860b-4ace-b604-59b9c3b06405" [label="spam"];
"d7636d96-6f9e-4883-a581-c8919088cbf2" [label="spam"];
"1d1933b4-12e1-41ea-b6c1-46f8bacb851c" [label="spam"];
"ac8ca11e-10f5-4a3f-8c1d-e31933d74a8d" [label="when"];
"00cb082b-b9c3-4417-9d25-209f2b4957c8" [label="spam"];
"b7cc5eda-0d6a-4893-ba1d-78641ed8a949" [label="ham"];
"577ef6a5-eb97-4dc1-9fae-741253db33aa" [label="dead"];
"ae9e2b6c-1bdb-4cdb-aaea-01f0d3e138c6" [label="spam"];
"6c303284-fb0a-44e7-b92d-dcae4ffd828d" [label="ham"];
"8fbb40df-9b8c-4525-a34a-0ec254360649" -> "9c2cf1a6-e34b-4f3c-9cc0-17a4e20f12f7" [label="0"];
"9c2cf1a6-e34b-4f3c-9cc0-17a4e20f12f7" -> "ef9e9738-2596-4bdf-a42d-28d7f5471ca7" [label="0"];
"ef9e9738-2596-4bdf-a42d-28d7f5471ca7" -> "626c0a8b-c1fe-42d9-ad4f-b79e03cf82f7" [label="0"];
"626c0a8b-c1fe-42d9-ad4f-b79e03cf82f7" -> "56bdce7c-b23c-4d52-a802-1c28377ec7f5" [label="0"];
"56bdce7c-b23c-4d52-a802-1c28377ec7f5" -> "ebe24cea-1310-40fe-b164-25032c942aec" [label="0"];
"56bdce7c-b23c-4d52-a802-1c28377ec7f5" -> "1a56632b-860b-4ace-b604-59b9c3b06405" [label="1"];
"626c0a8b-c1fe-42d9-ad4f-b79e03cf82f7" -> "d7636d96-6f9e-4883-a581-c8919088cbf2" [label="1"];
"ef9e9738-2596-4bdf-a42d-28d7f5471ca7" -> "1d1933b4-12e1-41ea-b6c1-46f8bacb851c" [label="1"];
"9c2cf1a6-e34b-4f3c-9cc0-17a4e20f12f7" -> "ac8ca11e-10f5-4a3f-8c1d-e31933d74a8d" [label="1"];
"ac8ca11e-10f5-4a3f-8c1d-e31933d74a8d" -> "00cb082b-b9c3-4417-9d25-209f2b4957c8" [label="0"];
"ac8ca11e-10f5-4a3f-8c1d-e31933d74a8d" -> "b7cc5eda-0d6a-4893-ba1d-78641ed8a949" [label="1"];
"8fbb40df-9b8c-4525-a34a-0ec254360649" -> "577ef6a5-eb97-4dc1-9fae-741253db33aa" [label="1"];
"577ef6a5-eb97-4dc1-9fae-741253db33aa" -> "ae9e2b6c-1bdb-4cdb-aaea-01f0d3e138c6" [label="0"];
"577ef6a5-eb97-4dc1-9fae-741253db33aa" -> "6c303284-fb0a-44e7-b92d-dcae4ffd828d" [label="1"];
}
================================================
FILE: decision_tree/trees.py
================================================
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: PytLab <shaozhengjiang@gmail.com>
# Date: 2017-07-07
import copy
import uuid
import pickle
from collections import defaultdict, namedtuple
from math import log2
class DecisionTreeClassifier(object):
    ''' Decision tree classifier that splits the dataset with the ID3 algorithm.
'''
@staticmethod
def split_dataset(dataset, classes, feat_idx):
        ''' Split the dataset according to a given feature and its values.
        :param dataset: the dataset to split, a list of data vectors.
        :param classes: the classes corresponding to the dataset, with the same length as the dataset
        :param feat_idx: index of the feature in the feature vectors
        :return splited_dict: dict holding the split data, mapping feature value to [sub dataset, sub class list]
'''
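        # The returned dict maps each feature value to [sub_dataset, sub_classes],
        # with the split feature removed from every data vector in sub_dataset.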
splited_dict = {}
for data_vect, cls in zip(dataset, classes):
feat_val = data_vect[feat_idx]
sub_dataset, sub_classes = splited_dict.setdefault(feat_val, [[], []])
sub_dataset.append(data_vect[: feat_idx] + data_vect[feat_idx+1: ])
sub_classes.append(cls)
return splited_dict
def get_shanno_entropy(self, values):
        ''' Compute the Shannon entropy of the values in the given list.
'''
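        # Shannon entropy: H = -sum(p * log2(p)), where p is the relative
        # frequency of each distinct value in `values`.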
uniq_vals = set(values)
val_nums = {key: values.count(key) for key in uniq_vals}
probs = [v/len(values) for k, v in val_nums.items()]
entropy = sum([-prob*log2(prob) for prob in probs])
return entropy
def choose_best_split_feature(self, dataset, classes):
        ''' Choose the best feature to split the data on, according to information gain.
        :param dataset: the dataset to split
        :param classes: the classes corresponding to the dataset
        :return: index of the feature with the largest information gain
'''
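        # Information gain for a candidate split:
        #   gain = H(classes) - sum(len(sub_classes)/len(classes) * H(sub_classes))
        # The feature with the largest gain is selected (ID3 criterion).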
base_entropy = self.get_shanno_entropy(classes)
feat_num = len(dataset[0])
entropy_gains = []
for i in range(feat_num):
splited_dict = self.split_dataset(dataset, classes, i)
new_entropy = sum([
len(sub_classes)/len(classes)*self.get_shanno_entropy(sub_classes)
for _, (_, sub_classes) in splited_dict.items()
])
entropy_gains.append(base_entropy - new_entropy)
return entropy_gains.index(max(entropy_gains))
    @staticmethod
    def get_majority(classes):
        ''' Return the class that accounts for the majority of the given classes.
'''
cls_num = defaultdict(lambda: 0)
for cls in classes:
cls_num[cls] += 1
return max(cls_num, key=cls_num.get)
def create_tree(self, dataset, classes, feat_names):
        ''' Create a decision tree recursively from the current dataset.
        :param dataset: the dataset
        :param feat_names: feature names corresponding to the data in the dataset
        :param classes: classes corresponding to the data in the dataset
        :return tree: the decision tree, returned as a dict
'''
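        # The resulting tree is a nested dict of the form
        # {feature_name: {feature_value: subtree_or_leaf_class, ...}}.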
        # Stop splitting if only one class is left in the dataset.
if len(set(classes)) == 1:
return classes[0]
        # If all features have been used up, return the majority class.
if len(feat_names) == 0:
            return self.get_majority(classes)
        # Split on the best feature and create new sub-trees.
tree = {}
best_feat_idx = self.choose_best_split_feature(dataset, classes)
feature = feat_names[best_feat_idx]
tree[feature] = {}
        # Build the sub-datasets used to create sub-trees recursively.
sub_feat_names = feat_names[:]
sub_feat_names.pop(best_feat_idx)
splited_dict = self.split_dataset(dataset, classes, best_feat_idx)
for feat_val, (sub_dataset, sub_classes) in splited_dict.items():
tree[feature][feat_val] = self.create_tree(sub_dataset,
sub_classes,
sub_feat_names)
self.tree = tree
self.feat_names = feat_names
return tree
def get_nodes_edges(self, tree=None, root_node=None):
        ''' Return all nodes and edges in the tree.
'''
Node = namedtuple('Node', ['id', 'label'])
Edge = namedtuple('Edge', ['start', 'end', 'label'])
if tree is None:
tree = self.tree
if type(tree) is not dict:
return [], []
nodes, edges = [], []
if root_node is None:
label = list(tree.keys())[0]
root_node = Node._make([uuid.uuid4(), label])
nodes.append(root_node)
for edge_label, sub_tree in tree[root_node.label].items():
node_label = list(sub_tree.keys())[0] if type(sub_tree) is dict else sub_tree
sub_node = Node._make([uuid.uuid4(), node_label])
nodes.append(sub_node)
edge = Edge._make([root_node, sub_node, edge_label])
edges.append(edge)
sub_nodes, sub_edges = self.get_nodes_edges(sub_tree, root_node=sub_node)
nodes.extend(sub_nodes)
edges.extend(sub_edges)
return nodes, edges
def dotify(self, tree=None):
        ''' Get the content of a Graphviz DOT file for the tree.
'''
if tree is None:
tree = self.tree
content = 'digraph decision_tree {\n'
nodes, edges = self.get_nodes_edges(tree)
for node in nodes:
content += ' "{}" [label="{}"];\n'.format(node.id, node.label)
for edge in edges:
start, label, end = edge.start, edge.label, edge.end
content += ' "{}" -> "{}" [label="{}"];\n'.format(start.id, end.id, label)
content += '}'
return content
def classify(self, data_vect, feat_names=None, tree=None):
        ''' Classify a data vector using the constructed decision tree.
'''
if tree is None:
tree = self.tree
if feat_names is None:
feat_names = self.feat_names
# Recursive base case.
if type(tree) is not dict:
return tree
feature = list(tree.keys())[0]
value = data_vect[feat_names.index(feature)]
sub_tree = tree[feature][value]
return self.classify(data_vect, feat_names, sub_tree)
def dump_tree(self, filename, tree=None):
        ''' Store (pickle) the decision tree to a file.
'''
if tree is None:
tree = self.tree
with open(filename, 'wb') as f:
pickle.dump(tree, f)
def load_tree(self, filename):
        ''' Load a tree structure from a pickle file.
'''
with open(filename, 'rb') as f:
tree = pickle.load(f)
self.tree = tree
return tree
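# Minimal usage sketch (illustrative only; `dataset`, `classes`, `feat_names`
# and `data_vect` are placeholders for data prepared as in lenses.py or
# sms_tree.py in this directory):
#     clf = DecisionTreeClassifier()
#     clf.create_tree(dataset, classes, feat_names)
#     print(clf.dotify())              # Graphviz DOT source of the fitted tree
#     label = clf.classify(data_vect)  # predict the class of one data vector
#     clf.dump_tree('tree.pkl')        # persist; restore later with load_tree()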
================================================
FILE: linear_regression/abalone.txt
================================================
1 0.455 0.365 0.095 0.514 0.2245 0.101 0.15 15
1 0.35 0.265 0.09 0.2255 0.0995 0.0485 0.07 7
-1 0.53 0.42 0.135 0.677 0.2565 0.1415 0.21 9
1 0.44 0.365 0.125 0.516 0.2155 0.114 0.155 10
0 0.33 0.255 0.08 0.205 0.0895 0.0395 0.055 7
0 0.425 0.3 0.095 0.3515 0.141 0.0775 0.12 8
-1 0.53 0.415 0.15 0.7775 0.237 0.1415 0.33 20
-1 0.545 0.425 0.125 0.768 0.294 0.1495 0.26 16
1 0.475 0.37 0.125 0.5095 0.2165 0.1125 0.165 9
-1 0.55 0.44 0.15 0.8945 0.3145 0.151 0.32 19
-1 0.525 0.38 0.14 0.6065 0.194 0.1475 0.21 14
1 0.43 0.35 0.11 0.406 0.1675 0.081 0.135 10
1 0.49 0.38 0.135 0.5415 0.2175 0.095 0.19 11
-1 0.535 0.405 0.145 0.6845 0.2725 0.171 0.205 10
-1 0.47 0.355 0.1 0.4755 0.1675 0.0805 0.185 10
1 0.5 0.4 0.13 0.6645 0.258 0.133 0.24 12
0 0.355 0.28 0.085 0.2905 0.095 0.0395 0.115 7
-1 0.44 0.34 0.1 0.451 0.188 0.087 0.13 10
1 0.365 0.295 0.08 0.2555 0.097 0.043 0.1 7
1 0.45 0.32 0.1 0.381 0.1705 0.075 0.115 9
1 0.355 0.28 0.095 0.2455 0.0955 0.062 0.075 11
0 0.38 0.275 0.1 0.2255 0.08 0.049 0.085 10
-1 0.565 0.44 0.155 0.9395 0.4275 0.214 0.27 12
-1 0.55 0.415 0.135 0.7635 0.318 0.21 0.2 9
-1 0.615 0.48 0.165 1.1615 0.513 0.301 0.305 10
-1 0.56 0.44 0.14 0.9285 0.3825 0.188 0.3 11
-1 0.58 0.45 0.185 0.9955 0.3945 0.272 0.285 11
1 0.59 0.445 0.14 0.931 0.356 0.234 0.28 12
1 0.605 0.475 0.18 0.9365 0.394 0.219 0.295 15
1 0.575 0.425 0.14 0.8635 0.393 0.227 0.2 11
1 0.58 0.47 0.165 0.9975 0.3935 0.242 0.33 10
-1 0.68 0.56 0.165 1.639 0.6055 0.2805 0.46 15
1 0.665 0.525 0.165 1.338 0.5515 0.3575 0.35 18
-1 0.68 0.55 0.175 1.798 0.815 0.3925 0.455 19
-1 0.705 0.55 0.2 1.7095 0.633 0.4115 0.49 13
1 0.465 0.355 0.105 0.4795 0.227 0.124 0.125 8
-1 0.54 0.475 0.155 1.217 0.5305 0.3075 0.34 16
-1 0.45 0.355 0.105 0.5225 0.237 0.1165 0.145 8
-1 0.575 0.445 0.135 0.883 0.381 0.2035 0.26 11
1 0.355 0.29 0.09 0.3275 0.134 0.086 0.09 9
-1 0.45 0.335 0.105 0.425 0.1865 0.091 0.115 9
-1 0.55 0.425 0.135 0.8515 0.362 0.196 0.27 14
0 0.24 0.175 0.045 0.07 0.0315 0.0235 0.02 5
0 0.205 0.15 0.055 0.042 0.0255 0.015 0.012 5
0 0.21 0.15 0.05 0.042 0.0175 0.0125 0.015 4
0 0.39 0.295 0.095 0.203 0.0875 0.045 0.075 7
1 0.47 0.37 0.12 0.5795 0.293 0.227 0.14 9
-1 0.46 0.375 0.12 0.4605 0.1775 0.11 0.15 7
0 0.325 0.245 0.07 0.161 0.0755 0.0255 0.045 6
-1 0.525 0.425 0.16 0.8355 0.3545 0.2135 0.245 9
0 0.52 0.41 0.12 0.595 0.2385 0.111 0.19 8
1 0.4 0.32 0.095 0.303 0.1335 0.06 0.1 7
1 0.485 0.36 0.13 0.5415 0.2595 0.096 0.16 10
-1 0.47 0.36 0.12 0.4775 0.2105 0.1055 0.15 10
1 0.405 0.31 0.1 0.385 0.173 0.0915 0.11 7
-1 0.5 0.4 0.14 0.6615 0.2565 0.1755 0.22 8
1 0.445 0.35 0.12 0.4425 0.192 0.0955 0.135 8
1 0.47 0.385 0.135 0.5895 0.2765 0.12 0.17 8
0 0.245 0.19 0.06 0.086 0.042 0.014 0.025 4
-1 0.505 0.4 0.125 0.583 0.246 0.13 0.175 7
1 0.45 0.345 0.105 0.4115 0.18 0.1125 0.135 7
1 0.505 0.405 0.11 0.625 0.305 0.16 0.175 9
-1 0.53 0.41 0.13 0.6965 0.302 0.1935 0.2 10
1 0.425 0.325 0.095 0.3785 0.1705 0.08 0.1 7
1 0.52 0.4 0.12 0.58 0.234 0.1315 0.185 8
1 0.475 0.355 0.12 0.48 0.234 0.1015 0.135 8
-1 0.565 0.44 0.16 0.915 0.354 0.1935 0.32 12
-1 0.595 0.495 0.185 1.285 0.416 0.224 0.485 13
-1 0.475 0.39 0.12 0.5305 0.2135 0.1155 0.17 10
0 0.31 0.235 0.07 0.151 0.063 0.0405 0.045 6
1 0.555 0.425 0.13 0.7665 0.264 0.168 0.275 13
-1 0.4 0.32 0.11 0.353 0.1405 0.0985 0.1 8
-1 0.595 0.475 0.17 1.247 0.48 0.225 0.425 20
1 0.57 0.48 0.175 1.185 0.474 0.261 0.38 11
-1 0.605 0.45 0.195 1.098 0.481 0.2895 0.315 13
-1 0.6 0.475 0.15 1.0075 0.4425 0.221 0.28 15
1 0.595 0.475 0.14 0.944 0.3625 0.189 0.315 9
-1 0.6 0.47 0.15 0.922 0.363 0.194 0.305 10
-1 0.555 0.425 0.14 0.788 0.282 0.1595 0.285 11
-1 0.615 0.475 0.17 1.1025 0.4695 0.2355 0.345 14
-1 0.575 0.445 0.14 0.941 0.3845 0.252 0.285 9
1 0.62 0.51 0.175 1.615 0.5105 0.192 0.675 12
-1 0.52 0.425 0.165 0.9885 0.396 0.225 0.32 16
1 0.595 0.475 0.16 1.3175 0.408 0.234 0.58 21
1 0.58 0.45 0.14 1.013 0.38 0.216 0.36 14
-1 0.57 0.465 0.18 1.295 0.339 0.2225 0.44 12
1 0.625 0.465 0.14 1.195 0.4825 0.205 0.4 13
1 0.56 0.44 0.16 0.8645 0.3305 0.2075 0.26 10
-1 0.46 0.355 0.13 0.517 0.2205 0.114 0.165 9
-1 0.575 0.45 0.16 0.9775 0.3135 0.231 0.33 12
1 0.565 0.425 0.135 0.8115 0.341 0.1675 0.255 15
1 0.555 0.44 0.15 0.755 0.307 0.1525 0.26 12
1 0.595 0.465 0.175 1.115 0.4015 0.254 0.39 13
-1 0.625 0.495 0.165 1.262 0.507 0.318 0.39 10
1 0.695 0.56 0.19 1.494 0.588 0.3425 0.485 15
1 0.665 0.535 0.195 1.606 0.5755 0.388 0.48 14
1 0.535 0.435 0.15 0.725 0.269 0.1385 0.25 9
1 0.47 0.375 0.13 0.523 0.214 0.132 0.145 8
1 0.47 0.37 0.13 0.5225 0.201 0.133 0.165 7
-1 0.475 0.375 0.125 0.5785 0.2775 0.085 0.155 10
0 0.36 0.265 0.095 0.2315 0.105 0.046 0.075 7
1 0.55 0.435 0.145 0.843 0.328 0.1915 0.255 15
1 0.53 0.435 0.16 0.883 0.316 0.164 0.335 15
1 0.53 0.415 0.14 0.724 0.3105 0.1675 0.205 10
1 0.605 0.47 0.16 1.1735 0.4975 0.2405 0.345 12
-1 0.52 0.41 0.155 0.727 0.291 0.1835 0.235 12
-1 0.545 0.43 0.165 0.802 0.2935 0.183 0.28 11
-1 0.5 0.4 0.125 0.6675 0.261 0.1315 0.22 10
-1 0.51 0.39 0.135 0.6335 0.231 0.179 0.2 9
-1 0.435 0.395 0.105 0.3635 0.136 0.098 0.13 9
1 0.495 0.395 0.125 0.5415 0.2375 0.1345 0.155 9
1 0.465 0.36 0.105 0.431 0.172 0.107 0.175 9
0 0.435 0.32 0.08 0.3325 0.1485 0.0635 0.105 9
1 0.425 0.35 0.105 0.393 0.13 0.063 0.165 9
-1 0.545 0.41 0.125 0.6935 0.2975 0.146 0.21 11
-1 0.53 0.415 0.115 0.5915 0.233 0.1585 0.18 11
-1 0.49 0.375 0.135 0.6125 0.2555 0.102 0.22 11
1 0.44 0.34 0.105 0.402 0.1305 0.0955 0.165 10
-1 0.56 0.43 0.15 0.8825 0.3465 0.172 0.31 9
1 0.405 0.305 0.085 0.2605 0.1145 0.0595 0.085 8
-1 0.47 0.365 0.105 0.4205 0.163 0.1035 0.14 9
0 0.385 0.295 0.085 0.2535 0.103 0.0575 0.085 7
-1 0.515 0.425 0.14 0.766 0.304 0.1725 0.255 14
1 0.37 0.265 0.075 0.214 0.09 0.051 0.07 6
0 0.36 0.28 0.08 0.1755 0.081 0.0505 0.07 6
0 0.27 0.195 0.06 0.073 0.0285 0.0235 0.03 5
0 0.375 0.275 0.09 0.238 0.1075 0.0545 0.07 6
0 0.385 0.29 0.085 0.2505 0.112 0.061 0.08 8
1 0.7 0.535 0.16 1.7255 0.63 0.2635 0.54 19
1 0.71 0.54 0.165 1.959 0.7665 0.261 0.78 18
1 0.595 0.48 0.165 1.262 0.4835 0.283 0.41 17
-1 0.44 0.35 0.125 0.4035 0.175 0.063 0.129 9
-1 0.325 0.26 0.09 0.1915 0.085 0.036 0.062 7
0 0.35 0.26 0.095 0.211 0.086 0.056 0.068 7
0 0.265 0.2 0.065 0.0975 0.04 0.0205 0.028 7
-1 0.425 0.33 0.115 0.406 0.1635 0.081 0.1355 8
-1 0.305 0.23 0.08 0.156 0.0675 0.0345 0.048 7
1 0.345 0.255 0.09 0.2005 0.094 0.0295 0.063 9
-1 0.405 0.325 0.11 0.3555 0.151 0.063 0.117 9
1 0.375 0.285 0.095 0.253 0.096 0.0575 0.0925 9
-1 0.565 0.445 0.155 0.826 0.341 0.2055 0.2475 10
-1 0.55 0.45 0.145 0.741 0.295 0.1435 0.2665 10
1 0.65 0.52 0.19 1.3445 0.519 0.306 0.4465 16
1 0.56 0.455 0.155 0.797 0.34 0.19 0.2425 11
1 0.475 0.375 0.13 0.5175 0.2075 0.1165 0.17 10
-1 0.49 0.38 0.125 0.549 0.245 0.1075 0.174 10
1 0.46 0.35 0.12 0.515 0.224 0.108 0.1565 10
0 0.28 0.205 0.08 0.127 0.052 0.039 0.042 9
0 0.175 0.13 0.055 0.0315 0.0105 0.0065 0.0125 5
0 0.17 0.13 0.095 0.03 0.013 0.008 0.01 4
1 0.59 0.475 0.145 1.053 0.4415 0.262 0.325 15
-1 0.605 0.5 0.185 1.1185 0.469 0.2585 0.335 9
-1 0.635 0.515 0.19 1.3715 0.5065 0.305 0.45 10
-1 0.605 0.485 0.16 1.0565 0.37 0.2355 0.355 10
-1 0.565 0.45 0.135 0.9885 0.387 0.1495 0.31 12
1 0.515 0.405 0.13 0.722 0.32 0.131 0.21 10
-1 0.575 0.46 0.19 0.994 0.392 0.2425 0.34 13
1 0.645 0.485 0.215 1.514 0.546 0.2615 0.
================================================
SYMBOL INDEX (82 symbols across 19 files)
================================================
FILE: classification_and_regression_trees/compare.py
function get_corrcoef (line 7) | def get_corrcoef(X, Y):
FILE: classification_and_regression_trees/model_tree.py
function linear_regression (line 6) | def linear_regression(dataset):
function fleaf (line 21) | def fleaf(dataset):
function ferr (line 27) | def ferr(dataset):
function get_nodes_edges (line 34) | def get_nodes_edges(tree, root_node=None):
function dotify (line 67) | def dotify(tree):
function tree_predict (line 83) | def tree_predict(data, tree):
FILE: classification_and_regression_trees/prune.py
function not_tree (line 6) | def not_tree(tree):
function collapse (line 11) | def collapse(tree):
function postprune (line 19) | def postprune(tree, test_data):
FILE: classification_and_regression_trees/regression_tree.py
function load_data (line 14) | def load_data(filename):
function split_dataset (line 24) | def split_dataset(dataset, feat_idx, value):
function create_tree (line 35) | def create_tree(dataset, fleaf, ferr, opt=None):
function fleaf (line 67) | def fleaf(dataset):
function ferr (line 73) | def ferr(dataset):
function choose_best_feature (line 80) | def choose_best_feature(dataset, fleaf, ferr, opt):
function get_nodes_edges (line 126) | def get_nodes_edges(tree, root_node=None):
function dotify (line 159) | def dotify(tree):
function tree_predict (line 175) | def tree_predict(data, tree):
FILE: decision_tree/sms_tree.py
function get_doc_vector (line 19) | def get_doc_vector(words, vocabulary):
function parse_line (line 40) | def parse_line(line):
function parse_file (line 48) | def parse_file(filename):
FILE: decision_tree/trees.py
class DecisionTreeClassifier (line 13) | class DecisionTreeClassifier(object):
method split_dataset (line 18) | def split_dataset(dataset, classes, feat_idx):
method get_shanno_entropy (line 36) | def get_shanno_entropy(self, values):
method choose_best_split_feature (line 45) | def choose_best_split_feature(self, dataset, classes):
method get_majority (line 67) | def get_majority(classes):
method create_tree (line 76) | def create_tree(self, dataset, classes, feat_names):
method get_nodes_edges (line 113) | def get_nodes_edges(self, tree=None, root_node=None):
method dotify (line 146) | def dotify(self, tree=None):
method classify (line 165) | def classify(self, data_vect, feat_names=None, tree=None):
method dump_tree (line 184) | def dump_tree(self, filename, tree=None):
method load_tree (line 193) | def load_tree(self, filename):
FILE: linear_regression/lasso_regression.py
function lasso_regression (line 13) | def lasso_regression(X, y, lambd=0.2, threshold=0.1):
function lasso_traj (line 54) | def lasso_traj(X, y, ntest=30):
FILE: linear_regression/local_weighted_linear_regression.py
function lwlr (line 11) | def lwlr(x, X, Y, k):
FILE: linear_regression/ridge_regression.py
function ridge_regression (line 11) | def ridge_regression(X, y, lambd=0.2):
function ridge_traj (line 20) | def ridge_traj(X, y, ntest=30):
FILE: linear_regression/stage_wise_regression.py
function stagewise_regression (line 10) | def stagewise_regression(X, y, eps=0.01, niter=100):
FILE: linear_regression/standard_linear_regression.py
function load_data (line 8) | def load_data(filename):
function standarize (line 21) | def standarize(X):
function std_linreg (line 28) | def std_linreg(X, Y):
function get_corrcoef (line 35) | def get_corrcoef(X, Y):
FILE: logistic_regression/logreg_grad_ascent.py
class LogisticRegressionClassifier (line 10) | class LogisticRegressionClassifier(object):
method sigmoid (line 15) | def sigmoid(x):
method gradient_ascent (line 20) | def gradient_ascent(self, dataset, labels, max_iter=10000):
method classify (line 44) | def classify(self, data, w=None):
function load_data (line 54) | def load_data(filename):
function snapshot (line 67) | def snapshot(w, dataset, labels, pic_name):
FILE: logistic_regression/logreg_stoch_grad_ascent.py
class LogisticRegressionClassifier (line 12) | class LogisticRegressionClassifier(BaseClassifer):
method stoch_gradient_ascent (line 14) | def stoch_gradient_ascent(self, dataset, labels, max_iter=150):
FILE: logistic_regression/sms.py
function get_doc_vector (line 18) | def get_doc_vector(words, vocabulary):
function parse_line (line 39) | def parse_line(line):
function parse_file (line 47) | def parse_file(filename):
FILE: naive_bayes/bayes.py
class NaiveBayesClassifier (line 8) | class NaiveBayesClassifier(object):
method train (line 12) | def train(self, dataset, classes):
method classify (line 49) | def classify(self, doc_vect, cond_probs, cls_probs):
FILE: naive_bayes/sms.py
function get_doc_vector (line 18) | def get_doc_vector(words, vocabulary):
function parse_line (line 39) | def parse_line(line):
function parse_file (line 47) | def parse_file(filename):
FILE: support_vector_machine/svm_ga.py
function load_data (line 22) | def load_data(filename):
function get_w (line 31) | def get_w(alphas, dataset, labels):
function fitness (line 59) | def fitness(indv):
FILE: support_vector_machine/svm_platt_smo.py
class SVMUtil (line 10) | class SVMUtil(object):
method __init__ (line 14) | def __init__(self, dataset, labels, C, tolerance=0.001):
method f (line 24) | def f(self, x):
method get_error (line 38) | def get_error(self, i):
method update_errors (line 45) | def update_errors(self):
method meet_kkt (line 50) | def meet_kkt(self, i):
function load_data (line 61) | def load_data(filename):
function clip (line 70) | def clip(alpha, L, H):
function select_j_rand (line 80) | def select_j_rand(i, m):
function select_j (line 87) | def select_j(i, svm_util):
function get_w (line 107) | def get_w(alphas, dataset, labels):
function take_step (line 116) | def take_step(i, j, svm_util):
function examine_example (line 169) | def examine_example(i, svm_util):
function platt_smo (line 184) | def platt_smo(dataset, labels, C, max_iter):
FILE: support_vector_machine/svm_simple_smo.py
function load_data (line 10) | def load_data(filename):
function clip (line 19) | def clip(alpha, L, H):
function select_j (line 29) | def select_j(i, m):
function get_w (line 36) | def get_w(alphas, dataset, labels):
function simple_smo (line 45) | def simple_smo(dataset, labels, C, max_iter):