Full Code of Avik-Jain/100-Days-Of-ML-Code for AI

master 5d67810c1498 cached

24 files

55.9 KB

18.8k tokens

1 requests

Download .txt

Repository: Avik-Jain/100-Days-Of-ML-Code
Branch: master
Commit: 5d67810c1498
Files: 24
Total size: 55.9 KB

Directory structure:
gitextract_ol9wolcm/

├── .gitattributes
├── .github/
│   └── ISSUE_TEMPLATE/
│       ├── bug_report.md
│       └── feature_request.md
├── .gitignore
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── Code/
│   ├── Day 11 K-NN.md
│   ├── Day 13 SVM.md
│   ├── Day 1_Data PreProcessing.md
│   ├── Day 25 Decision Tree.md
│   ├── Day 34 Random_Forest.md
│   ├── Day 6 Logistic Regression.md
│   ├── Day2_Simple_Linear_Regression.md
│   └── Day3_Multiple_Linear_Regression.md
├── Info-graphs/
│   └── readme.md
├── LICENSE
├── Other Docs/
│   └── readme.md
├── README.md
├── _config.yml
└── datasets/
    ├── 50_Startups.csv
    ├── Data.csv
    ├── Social_Network_Ads.csv
    ├── readme.md
    └── studentscores.csv

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitattributes
================================================
*.pxd		text diff=python
*.py 		text diff=python
*.py3 		text diff=python
*.pyw 		text diff=python
*.pyx  		text diff=python


================================================
FILE: .github/ISSUE_TEMPLATE/bug_report.md
================================================
---
name: Bug report
about: Create a report to help us improve

---

**Describe the bug**
A clear and concise description of what the bug is.

**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

**Expected behavior**
A clear and concise description of what you expected to happen.

**Screenshots**
If applicable, add screenshots to help explain your problem.

**Desktop (please complete the following information):**
 - OS: [e.g. iOS]
 - Browser [e.g. chrome, safari]
 - Version [e.g. 22]

**Smartphone (please complete the following information):**
 - Device: [e.g. iPhone6]
 - OS: [e.g. iOS8.1]
 - Browser [e.g. stock browser, safari]
 - Version [e.g. 22]

**Additional context**
Add any other context about the problem here.


================================================
FILE: .github/ISSUE_TEMPLATE/feature_request.md
================================================
---
name: Feature request
about: Suggest an idea for this project

---

**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

**Describe the solution you'd like**
A clear and concise description of what you want to happen.

**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.

**Additional context**
Add any other context or screenshots about the feature request here.


================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/


================================================
FILE: CODE_OF_CONDUCT.md
================================================
# Contributor Covenant Code of Conduct

## Our Pledge

In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation.

## Our Standards

Examples of behavior that contributes to creating a positive environment include:

* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery and unwelcome sexual attention or advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a professional setting

## Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.

## Scope

This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at avikjain02@gmail.com. The project team will review and investigate all complaints, and will respond in a way that it deems appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [http://contributor-covenant.org/version/1/4][version]

[homepage]: http://contributor-covenant.org
[version]: http://contributor-covenant.org/version/1/4/


================================================
FILE: CONTRIBUTING.md
================================================
## Contributing
When contributing to this repository, please first discuss the change you wish to make via issue, email, or any other method with the owners of this repository before making a change.

Please note we have a code of conduct, please follow it in all your interactions with the project.


================================================
FILE: Code/Day 11 K-NN.md
================================================
# K-Nearest Neighbors (K-NN)

<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%207.jpg">
</p>

## The DataSet | Social Network 

<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Other%20Docs/data.PNG">
</p> 


## Importing the libraries
```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
```

## Importing the dataset
```python
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values
```

## Splitting the dataset into the Training set and Test set
```python
from sklearn.cross_validation import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
```
## Feature Scaling
```python
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
```
## Fitting K-NN to the Training set
```python
from sklearn.neighbors import KNeighborsClassifier
classifier = KNeighborsClassifier(n_neighbors = 5, metric = 'minkowski', p = 2)
classifier.fit(X_train, y_train)
```
## Predicting the Test set results
```python
y_pred = classifier.predict(X_test)
```

## Making the Confusion Matrix
```python
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
```


================================================
FILE: Code/Day 13 SVM.md
================================================
# Day 13 | Support Vector Machine (SVM)

## Importing the libraries
```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
```

## Importing the dataset
```python
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values
```

## Splitting the dataset into the Training set and Test set
```python
from sklearn.cross_validation import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
```

## Feature Scaling
```python
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.fit_transform(X_test)
```

## Fitting SVM to the Training set
```python
from sklearn.svm import SVC
classifier = SVC(kernel = 'linear', random_state = 0)
classifier.fit(X_train, y_train)
```
## Predicting the Test set results
```python
y_pred = classifier.predict(X_test)
```

## Making the Confusion Matrix
```python
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
```

## Visualising the Training set results

```python
from matplotlib.colors import ListedColormap
X_set, y_set = X_train, y_train
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
plt.title('SVM (Training set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()
```
<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Other%20Docs/ets.png">
</p>

## Visualising the Test set results
```python
from matplotlib.colors import ListedColormap
X_set, y_set = X_test, y_test
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
plt.title('SVM (Test set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()
```
<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Other%20Docs/test.png">
</p>


================================================
FILE: Code/Day 1_Data PreProcessing.md
================================================
# Data PreProcessing
<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%201.jpg">
</p>

As shown in the infograph we will break down data preprocessing in 6 essential steps.
Get the dataset from [here](https://github.com/Avik-Jain/100-Days-Of-ML-Code/tree/master/datasets) that is used in this example

## Step 1: Importing the libraries
```Python
import numpy as np
import pandas as pd
```
## Step 2: Importing dataset
```python
dataset = pd.read_csv('Data.csv')
X = dataset.iloc[ : , :-1].values
Y = dataset.iloc[ : , 3].values
```
## Step 3: Handling the missing data
```python
from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values = "NaN", strategy = "mean", axis = 0)
imputer = imputer.fit(X[ : , 1:3])
X[ : , 1:3] = imputer.transform(X[ : , 1:3])
```
## Step 4: Encoding categorical data
```python
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()
X[ : , 0] = labelencoder_X.fit_transform(X[ : , 0])
```
### Creating a dummy variable
```python
onehotencoder = OneHotEncoder(categorical_features = [0])
X = onehotencoder.fit_transform(X).toarray()
labelencoder_Y = LabelEncoder()
Y =  labelencoder_Y.fit_transform(Y)
```
## Step 5: Splitting the datasets into training sets and Test sets 
```python
from sklearn.cross_validation import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split( X , Y , test_size = 0.2, random_state = 0)
```

## Step 6: Feature Scaling
```python
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
X_train = sc_X.fit_transform(X_train)
X_test = sc_X.fit_transform(X_test)
```
### Done :smile:


================================================
FILE: Code/Day 25 Decision Tree.md
================================================
# Decision Tree Classification
<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%2023.jpg">
</p>

### Importing the libraries
```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
```

### Importing the dataset
```python
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values
```
### Splitting the dataset into the Training set and Test set
```python
from sklearn.cross_validation import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
```

### Feature Scaling
```python
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
```
### Fitting Decision Tree Classification to the Training set
```python
from sklearn.tree import DecisionTreeClassifier
classifier = DecisionTreeClassifier(criterion = 'entropy', random_state = 0)
classifier.fit(X_train, y_train)
```
### Predicting the Test set results
```python
y_pred = classifier.predict(X_test)
```
### Making the Confusion Matrix
```python
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
```
### Visualising the Training set results
```python
from matplotlib.colors import ListedColormap
X_set, y_set = X_train, y_train
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Decision Tree Classification (Training set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()
```
### Visualising the Test set results
```python
from matplotlib.colors import ListedColormap
X_set, y_set = X_test, y_test
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Decision Tree Classification (Test set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()
```


================================================
FILE: Code/Day 34 Random_Forest.md
================================================
# Random Forests
<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%2033.jpg">
</p>


### Importing the libraries
```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
```

### Importing the dataset
```python
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values
```
### Splitting the dataset into the Training set and Test set
```python
from sklearn.cross_validation import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
```

### Feature Scaling
```python
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
```
### Fitting Random Forest to the Training set
```python
from sklearn.ensemble import RandomForestClassifier
classifier = RandomForestClassifier(n_estimators = 10, criterion = 'entropy', random_state = 0)
classifier.fit(X_train, y_train)
```
### Predicting the Test set results
```python
y_pred = classifier.predict(X_test)
```
### Making the Confusion Matrix
```python
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
```
### Visualising the Training set results
```python
from matplotlib.colors import ListedColormap
X_set, y_set = X_train, y_train
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Random Forest Classification (Training set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()
```
### Visualising the Test set results
```python
from matplotlib.colors import ListedColormap
X_set, y_set = X_test, y_test
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Random Forest Classification (Test set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()
```


================================================
FILE: Code/Day 6 Logistic Regression.md
================================================
# Logistic Regression


<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%204.jpg">
</p>

## The DataSet | Social Network 

<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Other%20Docs/data.PNG">
</p> 

This dataset contains information of users in a social network. Those informations are the user id the gender the age and the estimated salary. A car company has just launched their brand new luxury SUV. And we're trying to see which of these users of the social network are going to buy this brand new SUV And the last column here tells If yes or no the user bought this SUV we are going to build a model that is going to predict if a user is going to buy or not the SUV based on two variables which are going to be the age and the estimated salary. So our matrix of feature is only going to be these two columns.
We want to find some correlations between the age and the estimated salary of a user and his decision to purchase yes or no the SUV.

## Step 1 | Data Pre-Processing

### Importing the Libraries

```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
```
### Importing the dataset

Get the dataset from [here](https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/datasets/Social_Network_Ads.csv)
```python
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values
```

### Splitting the dataset into the Training set and Test set

```python
from sklearn.cross_validation import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
```

### Feature Scaling

```python
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
```
## Step 2 | Logistic Regression Model

The library for this job which is going to be the linear model library and it is called linear because the logistic regression is a linear classifier which means that here since we're in two dimensions, our two categories of users are going to be separated by a straight line. Then import the logistic regression class.
Next we will create a new object from this class which is going to be our classifier that we are going to fit on our training set.

### Fitting Logistic Regression to the Training set

```python
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression()
classifier.fit(X_train, y_train)
```
## Step 3 | Predection

### Predicting the Test set results

```python
y_pred = classifier.predict(X_test)
```

## Step 4 | Evaluating The Predection

We predicted the test results and now we will evaluate if our logistic regression model learned and understood correctly.
So this confusion matrix is going to contain the correct predictions that our model made on the set as well as the incorrect predictions.

### Making the Confusion Matrix

```python
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
```

## Visualization

<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Other%20Docs/training.png">
</p> 

<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Other%20Docs/testing.png">
</p> 


================================================
FILE: Code/Day2_Simple_Linear_Regression.md
================================================
# Simple Linear Regression


<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%202.jpg">
</p>


# Step 1: Data Preprocessing
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

dataset = pd.read_csv('studentscores.csv')
X = dataset.iloc[ : ,   : 1 ].values
Y = dataset.iloc[ : , 1 ].values

from sklearn.cross_validation import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split( X, Y, test_size = 1/4, random_state = 0) 
```

# Step 2: Fitting Simple Linear Regression Model to the training set
 ```python
 from sklearn.linear_model import LinearRegression
 regressor = LinearRegression()
 regressor = regressor.fit(X_train, Y_train)
 ```
 # Step 3: Predecting the Result
 ```python
 Y_pred = regressor.predict(X_test)
 ```
 
 # Step 4: Visualization 
 ## Visualising the Training results
 ```python
 plt.scatter(X_train , Y_train, color = 'red')
 plt.plot(X_train , regressor.predict(X_train), color ='blue')
 ```
 ## Visualizing the test results
 ```python
 plt.scatter(X_test , Y_test, color = 'red')
 plt.plot(X_test , regressor.predict(X_test), color ='blue')
 ``` 


================================================
FILE: Code/Day3_Multiple_Linear_Regression.md
================================================
# Multiple Linear Regression


<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%203.jpg">
</p>


## Step 1: Data Preprocessing

### Importing the libraries
```python
import pandas as pd
import numpy as np
```
### Importing the dataset
```python
dataset = pd.read_csv('50_Startups.csv')
X = dataset.iloc[ : , :-1].values
Y = dataset.iloc[ : ,  4 ].values
```

### Encoding Categorical data
```python
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder = LabelEncoder()
X[: , 3] = labelencoder.fit_transform(X[ : , 3])
onehotencoder = OneHotEncoder(categorical_features = [3])
X = onehotencoder.fit_transform(X).toarray()
```

### Avoiding Dummy Variable Trap
```python
X = X[: , 1:]
```

### Splitting the dataset into the Training set and Test set
```python
from sklearn.cross_validation import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.2, random_state = 0)
```
## Step 2: Fitting Multiple Linear Regression to the Training set
```python
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, Y_train)
```

## Step 3: Predicting the Test set results
```python
y_pred = regressor.predict(X_test)
```


================================================
FILE: Info-graphs/readme.md
================================================
Each Day Infograph


================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2018 Avik Jain

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: Other Docs/readme.md
================================================
Images for representation


================================================
FILE: README.md
================================================
# 100-Days-Of-ML-Code

100 Days of Machine Learning Coding as proposed by [Siraj Raval](https://github.com/llSourcell)

Get the datasets from [here](https://github.com/Avik-Jain/100-Days-Of-ML-Code/tree/master/datasets)

## Data PreProcessing | Day 1
Check out the code from [here](https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Code/Day%201_Data%20PreProcessing.md).

<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%201.jpg">
</p>

## Simple Linear Regression | Day 2
Check out the code from [here](https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Code/Day2_Simple_Linear_Regression.md).

<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%202.jpg">
</p>

## Multiple Linear Regression | Day 3
Check out the code from [here](https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Code/Day3_Multiple_Linear_Regression.md).

<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%203.jpg">
</p>

## Logistic Regression | Day 4

<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%204.jpg">
</p>

## Logistic Regression | Day 5
Moving forward into #100DaysOfMLCode today I dived into the deeper depth of what Logistic Regression actually is and what is the math involved behind it. Learned how cost function is calculated and then how to apply gradient descent algorithm to cost function to minimize the error in prediction.  
Due to less time I will now be posting an infographic on alternate days.
Also if someone wants to help me out in documentaion of code and already has some experince in the field and knows Markdown for github please contact me on LinkedIn :) .

## Implementing Logistic Regression | Day 6
Check out the Code [here](https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Code/Day%206%20Logistic%20Regression.md)

## K Nearest Neighbours | Day 7
<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%207.jpg">
</p>

## Math Behind Logistic Regression | Day 8 

#100DaysOfMLCode To clear my insights on logistic regression I was searching on the internet for some resource or article and I came across this article (https://towardsdatascience.com/logistic-regression-detailed-overview-46c4da4303bc) by Saishruthi Swaminathan. 

It gives a detailed description of Logistic Regression. Do check it out.

## Support Vector Machines | Day 9
Got an intution on what SVM is and how it is used to solve Classification problem.

## SVM and KNN | Day 10
Learned more about how SVM works and implementing the K-NN algorithm.

## Implementation of K-NN | Day 11  

Implemented the K-NN algorithm for classification. #100DaysOfMLCode 
Support Vector Machine Infographic is halfway complete. Will update it tomorrow.

## Support Vector Machines | Day 12
<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%2012.jpg">
</p>

## Naive Bayes Classifier | Day 13

Continuing with #100DaysOfMLCode today I went through the Naive Bayes classifier.
I am also implementing the SVM in python using scikit-learn. Will update the code soon.

## Implementation of SVM | Day 14
Today I implemented SVM on linearly related data. Used Scikit-Learn library. In Scikit-Learn we have SVC classifier which we use to achieve this task. Will be using kernel-trick on next implementation.
Check the code [here](https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Code/Day%2013%20SVM.md).

## Naive Bayes Classifier and Black Box Machine Learning | Day 15
Learned about different types of naive bayes classifiers. Also started the lectures by [Bloomberg](https://bloomberg.github.io/foml/#home). First one in the playlist was Black Box Machine Learning. It gives the whole overview about prediction functions, feature extraction, learning algorithms, performance evaluation, cross-validation, sample bias, nonstationarity, overfitting, and hyperparameter tuning.

## Implemented SVM using Kernel Trick | Day 16
Using Scikit-Learn library implemented SVM algorithm along with kernel function which maps our data points into higher dimension to find optimal hyperplane. 

## Started Deep learning Specialization on Coursera | Day 17
Completed the whole Week 1 and Week 2 on a single day. Learned Logistic regression as Neural Network. 

## Deep learning Specialization on Coursera | Day 18
Completed the Course 1 of the deep learning specialization. Implemented a neural net in python.

## The Learning Problem , Professor Yaser Abu-Mostafa | Day 19
Started Lecture 1 of 18 of Caltech's Machine Learning Course - CS 156 by Professor Yaser Abu-Mostafa. It was basically an introduction to the upcoming lectures. He also explained Perceptron Algorithm.

## Started Deep learning Specialization Course 2 | Day 20
Completed the Week 1 of Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization.

## Web Scraping | Day 21
Watched some tutorials on how to do web scraping using Beautiful Soup in order to collect data for building a model.

## Is Learning Feasible? | Day 22
Lecture 2 of 18 of Caltech's Machine Learning Course - CS 156 by Professor Yaser Abu-Mostafa. Learned about Hoeffding Inequality.

## Decision Trees | Day 23
<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%2023.jpg">
</p>

## Introduction To Statistical Learning Theory | Day 24
Lec 3 of Bloomberg ML course introduced some of the core concepts like input space, action space, outcome space, prediction functions, loss functions, and hypothesis spaces.

## Implementing Decision Trees | Day 25
Check the code [here.](https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Code/Day%2025%20Decision%20Tree.md)

## Jumped To Brush up Linear Algebra | Day 26
Found an amazing [channel](https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw) on youtube 3Blue1Brown. It has a playlist called Essence of Linear Algebra. Started off by completing 4 videos which gave a complete overview of Vectors, Linear Combinations, Spans, Basis Vectors, Linear Transformations and Matrix Multiplication. 

Link to the playlist [here.](https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab)

## Jumped To Brush up Linear Algebra | Day 27
Continuing with the playlist completed next 4 videos discussing topics 3D Transformations, Determinants, Inverse Matrix, Column Space, Null Space and Non-Square Matrices.

Link to the playlist [here.](https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab)

## Jumped To Brush up Linear Algebra | Day 28
In the playlist of 3Blue1Brown completed another 3 videos from the essence of linear algebra. 
Topics covered were Dot Product and Cross Product.

Link to the playlist [here.](https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab)


## Jumped To Brush up Linear Algebra | Day 29
Completed the whole playlist today, videos 12-14. Really an amazing playlist to refresh the concepts of Linear Algebra.
Topics covered were the change of basis, Eigenvectors and Eigenvalues, and Abstract Vector Spaces.

Link to the playlist [here.](https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab)

## Essence of calculus | Day 30
Completing the playlist - Essence of Linear Algebra by 3blue1brown a suggestion popped up by youtube regarding a series of videos again by the same channel 3Blue1Brown. Being already impressed by the previous series on Linear algebra I dived straight into it.
Completed about 5 videos on topics such as Derivatives, Chain Rule, Product Rule, and derivative of exponential.

Link to the playlist [here.](https://www.youtube.com/playlist?list=PLZHQObOWTQDMsr9K-rj53DwVRMYO3t5Yr)

## Essence of calculus | Day 31
Watched 2 Videos on topic Implicit Diffrentiation and Limits from the playlist Essence of Calculus.

Link to the playlist [here.](https://www.youtube.com/playlist?list=PLZHQObOWTQDMsr9K-rj53DwVRMYO3t5Yr)

## Essence of calculus | Day 32
Watched the remaining 4 videos covering topics Like Integration and Higher order derivatives.

Link to the playlist [here.](https://www.youtube.com/playlist?list=PLZHQObOWTQDMsr9K-rj53DwVRMYO3t5Yr)

## Random Forests | Day 33
<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%2033.jpg">
</p>

## Implementing Random Forests | Day 34
Check the code [here.](https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Code/Day%2034%20Random_Forest.md)

## But what *is* a Neural Network? | Deep learning, chapter 1  | Day 35
An Amazing Video on neural networks by 3Blue1Brown youtube channel. This video gives a good understanding of Neural Networks and uses Handwritten digit dataset to explain the concept. 
Link To the [video.](https://www.youtube.com/watch?v=aircAruvnKk&t=7s)

## Gradient descent, how neural networks learn | Deep learning, chapter 2 | Day 36
Part two of neural networks by 3Blue1Brown youtube channel. This video explains the concepts of Gradient Descent in an interesting way. 169 must watch and highly recommended.
Link To the [video.](https://www.youtube.com/watch?v=IHZwWFHWa-w)

## What is backpropagation really doing? | Deep learning, chapter 3 | Day 37
Part three of neural networks by 3Blue1Brown youtube channel. This video mostly discusses the partial derivatives and backpropagation.
Link To the [video.](https://www.youtube.com/watch?v=Ilg3gGewQ5U)

## Backpropagation calculus | Deep learning, chapter 4 | Day 38
Part four of neural networks by 3Blue1Brown youtube channel. The goal here is to represent, in somewhat more formal terms, the intuition for how backpropagation works and the video moslty discusses the partial derivatives and backpropagation.
Link To the [video.](https://www.youtube.com/watch?v=tIeHLnjs5U8)

## Deep Learning with Python, TensorFlow, and Keras tutorial | Day 39
Link To the [video.](https://www.youtube.com/watch?v=wQ8BIBpya2k&t=19s&index=2&list=PLQVvvaa0QuDfhTox0AjmQ6tvTgMBZBEXN)

## Loading in your own data - Deep Learning basics with Python, TensorFlow and Keras p.2 | Day 40
Link To the [video.](https://www.youtube.com/watch?v=j-3vuBynnOE&list=PLQVvvaa0QuDfhTox0AjmQ6tvTgMBZBEXN&index=2)

## Convolutional Neural Networks - Deep Learning basics with Python, TensorFlow and Keras p.3 | Day 41
Link To the [video.](https://www.youtube.com/watch?v=WvoLTXIjBYU&list=PLQVvvaa0QuDfhTox0AjmQ6tvTgMBZBEXN&index=3)

## Analyzing Models with TensorBoard - Deep Learning with Python, TensorFlow and Keras p.4 | Day 42
Link To the [video.](https://www.youtube.com/watch?v=BqgTU7_cBnk&list=PLQVvvaa0QuDfhTox0AjmQ6tvTgMBZBEXN&index=4)

## K Means Clustering | Day 43
Moved to Unsupervised Learning and studied about Clustering.
Working on my website check it out [avikjain.me](http://www.avikjain.me/)
Also found a wonderful animation that can help to easily understand K - Means Clustering [Link](http://shabal.in/visuals/kmeans/6.html)

<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%2043.jpg">
</p>

## K Means Clustering Implementation | Day 44
Implemented K Means Clustering. Check the code [here.]()

## Digging Deeper | NUMPY  | Day 45
Got a new book "Python Data Science HandBook" by JK VanderPlas Check the Jupyter notebooks [here.](https://github.com/jakevdp/PythonDataScienceHandbook)
<br>Started with chapter 2 : Introduction to Numpy. Covered topics like Data Types, Numpy arrays and Computations on Numpy arrays.
<br>Check the code - 
<br>[Introduction to NumPy](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.00-Introduction-to-NumPy.ipynb)
<br>[Understanding Data Types in Python](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.01-Understanding-Data-Types.ipynb)
<br>[The Basics of NumPy Arrays](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.02-The-Basics-Of-NumPy-Arrays.ipynb)
<br>[Computation on NumPy Arrays: Universal Functions](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.03-Computation-on-arrays-ufuncs.ipynb)

## Digging Deeper | NUMPY | Day 46
Chapter 2 : Aggregations, Comparisions and Broadcasting
<br>Link to Notebook:
<br>[Aggregations: Min, Max, and Everything In Between](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.04-Computation-on-arrays-aggregates.ipynb)
<br>[Computation on Arrays: Broadcasting](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.05-Computation-on-arrays-broadcasting.ipynb)
<br>[Comparisons, Masks, and Boolean Logic](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.06-Boolean-Arrays-and-Masks.ipynb)

## Digging Deeper | NUMPY | Day 47
Chapter 2 : Fancy Indexing, sorting arrays, Struchered Data
<br>Link to Notebook:
<br>[Fancy Indexing](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.07-Fancy-Indexing.ipynb)
<br>[Sorting Arrays](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.08-Sorting.ipynb)
<br>[Structured Data: NumPy's Structured Arrays](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.09-<br>Structured-Data-NumPy.ipynb)

## Digging Deeper | PANDAS | Day 48
Chapter 3 : Data Manipulation with Pandas
<br> Covered Various topics like Pandas Objects, Data Indexing and Selection, Operating on Data, Handling Missing Data, Hierarchical Indexing, ConCat and Append.
<br>Link To the Notebooks:
<br>[Data Manipulation with Pandas](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.00-Introduction-to-Pandas.ipynb)
<br>[Introducing Pandas Objects](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.01-Introducing-Pandas-Objects.ipynb)
<br>[Data Indexing and Selection](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.02-Data-Indexing-and-Selection.ipynb)
<br>[Operating on Data in Pandas](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.03-Operations-in-Pandas.ipynb)
<br>[Handling Missing Data](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.04-Missing-Values.ipynb)
<br>[Hierarchical Indexing](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.05-Hierarchical-Indexing.ipynb)
<br>[Combining Datasets: Concat and Append](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.06-Concat-And-Append.ipynb)

## Digging Deeper | PANDAS | Day 49
Chapter 3: Completed following topics- Merge and Join, Aggregation and grouping and Pivot Tables.
<br>[Combining Datasets: Merge and Join](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.07-Merge-and-Join.ipynb)
<br>[Aggregation and Grouping](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.08-Aggregation-and-Grouping.ipynb)
<br>[Pivot Tables](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.09-Pivot-Tables.ipynb)

## Digging Deeper | PANDAS | Day 50
Chapter 3: Vectorized Strings Operations, Working with Time Series
<br>Links to Notebooks:
<br>[Vectorized String Operations](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.10-Working-With-Strings.ipynb)
<br>[Working with Time Series](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.11-Working-with-Time-Series.ipynb)
<br>[High-Performance Pandas: eval() and query()](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.12-Performance-Eval-and-Query.ipynb)

## Digging Deeper | MATPLOTLIB | Day 51
Chapter 4: Visualization with Matplotlib 
Learned about Simple Line Plots, Simple Scatter Plotsand Density and Contour Plots.
<br>Links to Notebooks: 
<br>[Visualization with Matplotlib](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.00-Introduction-To-Matplotlib.ipynb)
<br>[Simple Line Plots](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.01-Simple-Line-Plots.ipynb)
<br>[Simple Scatter Plots](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.02-Simple-Scatter-Plots.ipynb)
<br>[Visualizing Errors](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.03-Errorbars.ipynb)
<br>[Density and Contour Plots](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.04-Density-and-Contour-Plots.ipynb)

## Digging Deeper | MATPLOTLIB | Day 52
Chapter 4: Visualization with Matplotlib 
Learned about Histograms, How to customize plot legends, colorbars, and buliding Multiple Subplots.
<br>Links to Notebooks: 
<br>[Histograms, Binnings, and Density](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.05-Histograms-and-Binnings.ipynb)
<br>[Customizing Plot Legends](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.06-Customizing-Legends.ipynb)
<br>[Customizing Colorbars](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.07-Customizing-Colorbars.ipynb)
<br>[Multiple Subplots](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.08-Multiple-Subplots.ipynb)
<br>[Text and Annotation](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.09-Text-and-Annotation.ipynb)

## Digging Deeper | MATPLOTLIB | Day 53
Chapter 4: Covered Three Dimensional Plotting in Mathplotlib.
<br>Links to Notebooks:
<br>[Three-Dimensional Plotting in Matplotlib](https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.12-Three-Dimensional-Plotting.ipynb)

## Hierarchical Clustering | Day 54
Studied about Hierarchical Clustering.
Check out this amazing [Visualization.](https://cdn-images-1.medium.com/max/800/1*ET8kCcPpr893vNZFs8j4xg.gif)
<p align="center">
  <img src="https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs/Day%2054.jpg">
</p>


================================================
FILE: _config.yml
================================================
theme: jekyll-theme-merlot

================================================
FILE: datasets/50_Startups.csv
================================================
R&D Spend,Administration,Marketing Spend,State,Profit
165349.2,136897.8,471784.1,New York,192261.83
162597.7,151377.59,443898.53,California,191792.06
153441.51,101145.55,407934.54,Florida,191050.39
144372.41,118671.85,383199.62,New York,182901.99
142107.34,91391.77,366168.42,Florida,166187.94
131876.9,99814.71,362861.36,New York,156991.12
134615.46,147198.87,127716.82,California,156122.51
130298.13,145530.06,323876.68,Florida,155752.6
120542.52,148718.95,311613.29,New York,152211.77
123334.88,108679.17,304981.62,California,149759.96
101913.08,110594.11,229160.95,Florida,146121.95
100671.96,91790.61,249744.55,California,144259.4
93863.75,127320.38,249839.44,Florida,141585.52
91992.39,135495.07,252664.93,California,134307.35
119943.24,156547.42,256512.92,Florida,132602.65
114523.61,122616.84,261776.23,New York,129917.04
78013.11,121597.55,264346.06,California,126992.93
94657.16,145077.58,282574.31,New York,125370.37
91749.16,114175.79,294919.57,Florida,124266.9
86419.7,153514.11,0,New York,122776.86
76253.86,113867.3,298664.47,California,118474.03
78389.47,153773.43,299737.29,New York,111313.02
73994.56,122782.75,303319.26,Florida,110352.25
67532.53,105751.03,304768.73,Florida,108733.99
77044.01,99281.34,140574.81,New York,108552.04
64664.71,139553.16,137962.62,California,107404.34
75328.87,144135.98,134050.07,Florida,105733.54
72107.6,127864.55,353183.81,New York,105008.31
66051.52,182645.56,118148.2,Florida,103282.38
65605.48,153032.06,107138.38,New York,101004.64
61994.48,115641.28,91131.24,Florida,99937.59
61136.38,152701.92,88218.23,New York,97483.56
63408.86,129219.61,46085.25,California,97427.84
55493.95,103057.49,214634.81,Florida,96778.92
46426.07,157693.92,210797.67,California,96712.8
46014.02,85047.44,205517.64,New York,96479.51
28663.76,127056.21,201126.82,Florida,90708.19
44069.95,51283.14,197029.42,California,89949.14
20229.59,65947.93,185265.1,New York,81229.06
38558.51,82982.09,174999.3,California,81005.76
28754.33,118546.05,172795.67,California,78239.91
27892.92,84710.77,164470.71,Florida,77798.83
23640.93,96189.63,148001.11,California,71498.49
15505.73,127382.3,35534.17,New York,69758.98
22177.74,154806.14,28334.72,California,65200.33
1000.23,124153.04,1903.93,New York,64926.08
1315.46,115816.21,297114.46,Florida,49490.75
0,135426.92,0,California,42559.73
542.05,51743.15,0,New York,35673.41
0,116983.8,45173.06,California,14681.4

================================================
FILE: datasets/Data.csv
================================================
Country,Age,Salary,Purchased
France,44,72000,No
Spain,27,48000,Yes
Germany,30,54000,No
Spain,38,61000,No
Germany,40,,Yes
France,35,58000,Yes
Spain,,52000,No
France,48,79000,Yes
Germany,50,83000,No
France,37,67000,Yes

================================================
FILE: datasets/Social_Network_Ads.csv
================================================
User ID,Gender,Age,EstimatedSalary,Purchased
15624510,Male,19,19000,0
15810944,Male,35,20000,0
15668575,Female,26,43000,0
15603246,Female,27,57000,0
15804002,Male,19,76000,0
15728773,Male,27,58000,0
15598044,Female,27,84000,0
15694829,Female,32,150000,1
15600575,Male,25,33000,0
15727311,Female,35,65000,0
15570769,Female,26,80000,0
15606274,Female,26,52000,0
15746139,Male,20,86000,0
15704987,Male,32,18000,0
15628972,Male,18,82000,0
15697686,Male,29,80000,0
15733883,Male,47,25000,1
15617482,Male,45,26000,1
15704583,Male,46,28000,1
15621083,Female,48,29000,1
15649487,Male,45,22000,1
15736760,Female,47,49000,1
15714658,Male,48,41000,1
15599081,Female,45,22000,1
15705113,Male,46,23000,1
15631159,Male,47,20000,1
15792818,Male,49,28000,1
15633531,Female,47,30000,1
15744529,Male,29,43000,0
15669656,Male,31,18000,0
15581198,Male,31,74000,0
15729054,Female,27,137000,1
15573452,Female,21,16000,0
15776733,Female,28,44000,0
15724858,Male,27,90000,0
15713144,Male,35,27000,0
15690188,Female,33,28000,0
15689425,Male,30,49000,0
15671766,Female,26,72000,0
15782806,Female,27,31000,0
15764419,Female,27,17000,0
15591915,Female,33,51000,0
15772798,Male,35,108000,0
15792008,Male,30,15000,0
15715541,Female,28,84000,0
15639277,Male,23,20000,0
15798850,Male,25,79000,0
15776348,Female,27,54000,0
15727696,Male,30,135000,1
15793813,Female,31,89000,0
15694395,Female,24,32000,0
15764195,Female,18,44000,0
15744919,Female,29,83000,0
15671655,Female,35,23000,0
15654901,Female,27,58000,0
15649136,Female,24,55000,0
15775562,Female,23,48000,0
15807481,Male,28,79000,0
15642885,Male,22,18000,0
15789109,Female,32,117000,0
15814004,Male,27,20000,0
15673619,Male,25,87000,0
15595135,Female,23,66000,0
15583681,Male,32,120000,1
15605000,Female,59,83000,0
15718071,Male,24,58000,0
15679760,Male,24,19000,0
15654574,Female,23,82000,0
15577178,Female,22,63000,0
15595324,Female,31,68000,0
15756932,Male,25,80000,0
15726358,Female,24,27000,0
15595228,Female,20,23000,0
15782530,Female,33,113000,0
15592877,Male,32,18000,0
15651983,Male,34,112000,1
15746737,Male,18,52000,0
15774179,Female,22,27000,0
15667265,Female,28,87000,0
15655123,Female,26,17000,0
15595917,Male,30,80000,0
15668385,Male,39,42000,0
15709476,Male,20,49000,0
15711218,Male,35,88000,0
15798659,Female,30,62000,0
15663939,Female,31,118000,1
15694946,Male,24,55000,0
15631912,Female,28,85000,0
15768816,Male,26,81000,0
15682268,Male,35,50000,0
15684801,Male,22,81000,0
15636428,Female,30,116000,0
15809823,Male,26,15000,0
15699284,Female,29,28000,0
15786993,Female,29,83000,0
15709441,Female,35,44000,0
15710257,Female,35,25000,0
15582492,Male,28,123000,1
15575694,Male,35,73000,0
15756820,Female,28,37000,0
15766289,Male,27,88000,0
15593014,Male,28,59000,0
15584545,Female,32,86000,0
15675949,Female,33,149000,1
15672091,Female,19,21000,0
15801658,Male,21,72000,0
15706185,Female,26,35000,0
15789863,Male,27,89000,0
15720943,Male,26,86000,0
15697997,Female,38,80000,0
15665416,Female,39,71000,0
15660200,Female,37,71000,0
15619653,Male,38,61000,0
15773447,Male,37,55000,0
15739160,Male,42,80000,0
15689237,Male,40,57000,0
15679297,Male,35,75000,0
15591433,Male,36,52000,0
15642725,Male,40,59000,0
15701962,Male,41,59000,0
15811613,Female,36,75000,0
15741049,Male,37,72000,0
15724423,Female,40,75000,0
15574305,Male,35,53000,0
15678168,Female,41,51000,0
15697020,Female,39,61000,0
15610801,Male,42,65000,0
15745232,Male,26,32000,0
15722758,Male,30,17000,0
15792102,Female,26,84000,0
15675185,Male,31,58000,0
15801247,Male,33,31000,0
15725660,Male,30,87000,0
15638963,Female,21,68000,0
15800061,Female,28,55000,0
15578006,Male,23,63000,0
15668504,Female,20,82000,0
15687491,Male,30,107000,1
15610403,Female,28,59000,0
15741094,Male,19,25000,0
15807909,Male,19,85000,0
15666141,Female,18,68000,0
15617134,Male,35,59000,0
15783029,Male,30,89000,0
15622833,Female,34,25000,0
15746422,Female,24,89000,0
15750839,Female,27,96000,1
15749130,Female,41,30000,0
15779862,Male,29,61000,0
15767871,Male,20,74000,0
15679651,Female,26,15000,0
15576219,Male,41,45000,0
15699247,Male,31,76000,0
15619087,Female,36,50000,0
15605327,Male,40,47000,0
15610140,Female,31,15000,0
15791174,Male,46,59000,0
15602373,Male,29,75000,0
15762605,Male,26,30000,0
15598840,Female,32,135000,1
15744279,Male,32,100000,1
15670619,Male,25,90000,0
15599533,Female,37,33000,0
15757837,Male,35,38000,0
15697574,Female,33,69000,0
15578738,Female,18,86000,0
15762228,Female,22,55000,0
15614827,Female,35,71000,0
15789815,Male,29,148000,1
15579781,Female,29,47000,0
15587013,Male,21,88000,0
15570932,Male,34,115000,0
15794661,Female,26,118000,0
15581654,Female,34,43000,0
15644296,Female,34,72000,0
15614420,Female,23,28000,0
15609653,Female,35,47000,0
15594577,Male,25,22000,0
15584114,Male,24,23000,0
15673367,Female,31,34000,0
15685576,Male,26,16000,0
15774727,Female,31,71000,0
15694288,Female,32,117000,1
15603319,Male,33,43000,0
15759066,Female,33,60000,0
15814816,Male,31,66000,0
15724402,Female,20,82000,0
15571059,Female,33,41000,0
15674206,Male,35,72000,0
15715160,Male,28,32000,0
15730448,Male,24,84000,0
15662067,Female,19,26000,0
15779581,Male,29,43000,0
15662901,Male,19,70000,0
15689751,Male,28,89000,0
15667742,Male,34,43000,0
15738448,Female,30,79000,0
15680243,Female,20,36000,0
15745083,Male,26,80000,0
15708228,Male,35,22000,0
15628523,Male,35,39000,0
15708196,Male,49,74000,0
15735549,Female,39,134000,1
15809347,Female,41,71000,0
15660866,Female,58,101000,1
15766609,Female,47,47000,0
15654230,Female,55,130000,1
15794566,Female,52,114000,0
15800890,Female,40,142000,1
15697424,Female,46,22000,0
15724536,Female,48,96000,1
15735878,Male,52,150000,1
15707596,Female,59,42000,0
15657163,Male,35,58000,0
15622478,Male,47,43000,0
15779529,Female,60,108000,1
15636023,Male,49,65000,0
15582066,Male,40,78000,0
15666675,Female,46,96000,0
15732987,Male,59,143000,1
15789432,Female,41,80000,0
15663161,Male,35,91000,1
15694879,Male,37,144000,1
15593715,Male,60,102000,1
15575002,Female,35,60000,0
15622171,Male,37,53000,0
15795224,Female,36,126000,1
15685346,Male,56,133000,1
15691808,Female,40,72000,0
15721007,Female,42,80000,1
15794253,Female,35,147000,1
15694453,Male,39,42000,0
15813113,Male,40,107000,1
15614187,Male,49,86000,1
15619407,Female,38,112000,0
15646227,Male,46,79000,1
15660541,Male,40,57000,0
15753874,Female,37,80000,0
15617877,Female,46,82000,0
15772073,Female,53,143000,1
15701537,Male,42,149000,1
15736228,Male,38,59000,0
15780572,Female,50,88000,1
15769596,Female,56,104000,1
15586996,Female,41,72000,0
15722061,Female,51,146000,1
15638003,Female,35,50000,0
15775590,Female,57,122000,1
15730688,Male,41,52000,0
15753102,Female,35,97000,1
15810075,Female,44,39000,0
15723373,Male,37,52000,0
15795298,Female,48,134000,1
15584320,Female,37,146000,1
15724161,Female,50,44000,0
15750056,Female,52,90000,1
15609637,Female,41,72000,0
15794493,Male,40,57000,0
15569641,Female,58,95000,1
15815236,Female,45,131000,1
15811177,Female,35,77000,0
15680587,Male,36,144000,1
15672821,Female,55,125000,1
15767681,Female,35,72000,0
15600379,Male,48,90000,1
15801336,Female,42,108000,1
15721592,Male,40,75000,0
15581282,Male,37,74000,0
15746203,Female,47,144000,1
15583137,Male,40,61000,0
15680752,Female,43,133000,0
15688172,Female,59,76000,1
15791373,Male,60,42000,1
15589449,Male,39,106000,1
15692819,Female,57,26000,1
15727467,Male,57,74000,1
15734312,Male,38,71000,0
15764604,Male,49,88000,1
15613014,Female,52,38000,1
15759684,Female,50,36000,1
15609669,Female,59,88000,1
15685536,Male,35,61000,0
15750447,Male,37,70000,1
15663249,Female,52,21000,1
15638646,Male,48,141000,0
15734161,Female,37,93000,1
15631070,Female,37,62000,0
15761950,Female,48,138000,1
15649668,Male,41,79000,0
15713912,Female,37,78000,1
15586757,Male,39,134000,1
15596522,Male,49,89000,1
15625395,Male,55,39000,1
15760570,Male,37,77000,0
15566689,Female,35,57000,0
15725794,Female,36,63000,0
15673539,Male,42,73000,1
15705298,Female,43,112000,1
15675791,Male,45,79000,0
15747043,Male,46,117000,1
15736397,Female,58,38000,1
15678201,Male,48,74000,1
15720745,Female,37,137000,1
15637593,Male,37,79000,1
15598070,Female,40,60000,0
15787550,Male,42,54000,0
15603942,Female,51,134000,0
15733973,Female,47,113000,1
15596761,Male,36,125000,1
15652400,Female,38,50000,0
15717893,Female,42,70000,0
15622585,Male,39,96000,1
15733964,Female,38,50000,0
15753861,Female,49,141000,1
15747097,Female,39,79000,0
15594762,Female,39,75000,1
15667417,Female,54,104000,1
15684861,Male,35,55000,0
15742204,Male,45,32000,1
15623502,Male,36,60000,0
15774872,Female,52,138000,1
15611191,Female,53,82000,1
15674331,Male,41,52000,0
15619465,Female,48,30000,1
15575247,Female,48,131000,1
15695679,Female,41,60000,0
15713463,Male,41,72000,0
15785170,Female,42,75000,0
15796351,Male,36,118000,1
15639576,Female,47,107000,1
15693264,Male,38,51000,0
15589715,Female,48,119000,1
15769902,Male,42,65000,0
15587177,Male,40,65000,0
15814553,Male,57,60000,1
15601550,Female,36,54000,0
15664907,Male,58,144000,1
15612465,Male,35,79000,0
15810800,Female,38,55000,0
15665760,Male,39,122000,1
15588080,Female,53,104000,1
15776844,Male,35,75000,0
15717560,Female,38,65000,0
15629739,Female,47,51000,1
15729908,Male,47,105000,1
15716781,Female,41,63000,0
15646936,Male,53,72000,1
15768151,Female,54,108000,1
15579212,Male,39,77000,0
15721835,Male,38,61000,0
15800515,Female,38,113000,1
15591279,Male,37,75000,0
15587419,Female,42,90000,1
15750335,Female,37,57000,0
15699619,Male,36,99000,1
15606472,Male,60,34000,1
15778368,Male,54,70000,1
15671387,Female,41,72000,0
15573926,Male,40,71000,1
15709183,Male,42,54000,0
15577514,Male,43,129000,1
15778830,Female,53,34000,1
15768072,Female,47,50000,1
15768293,Female,42,79000,0
15654456,Male,42,104000,1
15807525,Female,59,29000,1
15574372,Female,58,47000,1
15671249,Male,46,88000,1
15779744,Male,38,71000,0
15624755,Female,54,26000,1
15611430,Female,60,46000,1
15774744,Male,60,83000,1
15629885,Female,39,73000,0
15708791,Male,59,130000,1
15793890,Female,37,80000,0
15646091,Female,46,32000,1
15596984,Female,46,74000,0
15800215,Female,42,53000,0
15577806,Male,41,87000,1
15749381,Female,58,23000,1
15683758,Male,42,64000,0
15670615,Male,48,33000,1
15715622,Female,44,139000,1
15707634,Male,49,28000,1
15806901,Female,57,33000,1
15775335,Male,56,60000,1
15724150,Female,49,39000,1
15627220,Male,39,71000,0
15672330,Male,47,34000,1
15668521,Female,48,35000,1
15807837,Male,48,33000,1
15592570,Male,47,23000,1
15748589,Female,45,45000,1
15635893,Male,60,42000,1
15757632,Female,39,59000,0
15691863,Female,46,41000,1
15706071,Male,51,23000,1
15654296,Female,50,20000,1
15755018,Male,36,33000,0
15594041,Female,49,36000,1

================================================
FILE: datasets/readme.md
================================================
Day wise Dataset Used in Code


================================================
FILE: datasets/studentscores.csv
================================================
Hours,Scores
2.5,21
5.1,47
3.2,27
8.5,75
3.5,30
1.5,20
9.2,88
5.5,60
8.3,81
2.7,25
7.7,85
5.9,62
4.5,41
3.3,42
1.1,17
8.9,95
2.5,30
1.9,24
6.1,67
7.4,69
2.7,30
4.8,54
3.8,35
6.9,76
7.8,86

Download .txt

gitextract_ol9wolcm/

├── .gitattributes
├── .github/
│   └── ISSUE_TEMPLATE/
│       ├── bug_report.md
│       └── feature_request.md
├── .gitignore
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── Code/
│   ├── Day 11 K-NN.md
│   ├── Day 13 SVM.md
│   ├── Day 1_Data PreProcessing.md
│   ├── Day 25 Decision Tree.md
│   ├── Day 34 Random_Forest.md
│   ├── Day 6 Logistic Regression.md
│   ├── Day2_Simple_Linear_Regression.md
│   └── Day3_Multiple_Linear_Regression.md
├── Info-graphs/
│   └── readme.md
├── LICENSE
├── Other Docs/
│   └── readme.md
├── README.md
├── _config.yml
└── datasets/
    ├── 50_Startups.csv
    ├── Data.csv
    ├── Social_Network_Ads.csv
    ├── readme.md
    └── studentscores.csv

Download .json

Condensed preview — 24 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (61K chars).

[
  {
    "path": ".gitattributes",
    "chars": 124,
    "preview": "*.pxd\t\ttext diff=python\n*.py \t\ttext diff=python\n*.py3 \t\ttext diff=python\n*.pyw \t\ttext diff=python\n*.pyx  \t\ttext diff=pyt"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.md",
    "chars": 799,
    "preview": "---\nname: Bug report\nabout: Create a report to help us improve\n\n---\n\n**Describe the bug**\nA clear and concise descriptio"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.md",
    "chars": 560,
    "preview": "---\nname: Feature request\nabout: Suggest an idea for this project\n\n---\n\n**Is your feature request related to a problem? "
  },
  {
    "path": ".gitignore",
    "chars": 1203,
    "preview": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packagi"
  },
  {
    "path": "CODE_OF_CONDUCT.md",
    "chars": 3217,
    "preview": "# Contributor Covenant Code of Conduct\n\n## Our Pledge\n\nIn the interest of fostering an open and welcoming environment, w"
  },
  {
    "path": "CONTRIBUTING.md",
    "chars": 300,
    "preview": "## Contributing\nWhen contributing to this repository, please first discuss the change you wish to make via issue, email,"
  },
  {
    "path": "Code/Day 11 K-NN.md",
    "chars": 1398,
    "preview": "# K-Nearest Neighbors (K-NN)\n\n<p align=\"center\">\n  <img src=\"https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/maste"
  },
  {
    "path": "Code/Day 13 SVM.md",
    "chars": 2997,
    "preview": "# Day 13 | Support Vector Machine (SVM)\n\n## Importing the libraries\n```python\nimport numpy as np\nimport matplotlib.pyplo"
  },
  {
    "path": "Code/Day 1_Data PreProcessing.md",
    "chars": 1686,
    "preview": "# Data PreProcessing\n<p align=\"center\">\n  <img src=\"https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-gr"
  },
  {
    "path": "Code/Day 25 Decision Tree.md",
    "chars": 2988,
    "preview": "# Decision Tree Classification\n<p align=\"center\">\n  <img src=\"https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/mast"
  },
  {
    "path": "Code/Day 34 Random_Forest.md",
    "chars": 2983,
    "preview": "# Random Forests\n<p align=\"center\">\n  <img src=\"https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info-graphs"
  },
  {
    "path": "Code/Day 6 Logistic Regression.md",
    "chars": 3365,
    "preview": "# Logistic Regression\n\n\n<p align=\"center\">\n  <img src=\"https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Info"
  },
  {
    "path": "Code/Day2_Simple_Linear_Regression.md",
    "chars": 1178,
    "preview": "# Simple Linear Regression\n\n\n<p align=\"center\">\n  <img src=\"https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master"
  },
  {
    "path": "Code/Day3_Multiple_Linear_Regression.md",
    "chars": 1277,
    "preview": "# Multiple Linear Regression\n\n\n<p align=\"center\">\n  <img src=\"https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/mast"
  },
  {
    "path": "Info-graphs/readme.md",
    "chars": 19,
    "preview": "Each Day Infograph\n"
  },
  {
    "path": "LICENSE",
    "chars": 1066,
    "preview": "MIT License\n\nCopyright (c) 2018 Avik Jain\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\n"
  },
  {
    "path": "Other Docs/readme.md",
    "chars": 26,
    "preview": "Images for representation\n"
  },
  {
    "path": "README.md",
    "chars": 18245,
    "preview": "# 100-Days-Of-ML-Code\n\n100 Days of Machine Learning Coding as proposed by [Siraj Raval](https://github.com/llSourcell)\n\n"
  },
  {
    "path": "_config.yml",
    "chars": 26,
    "preview": "theme: jekyll-theme-merlot"
  },
  {
    "path": "datasets/50_Startups.csv",
    "chars": 2436,
    "preview": "R&D Spend,Administration,Marketing Spend,State,Profit\r\n165349.2,136897.8,471784.1,New York,192261.83\r\n162597.7,151377.59"
  },
  {
    "path": "datasets/Data.csv",
    "chars": 226,
    "preview": "Country,Age,Salary,Purchased\r\nFrance,44,72000,No\r\nSpain,27,48000,Yes\r\nGermany,30,54000,No\r\nSpain,38,61000,No\r\nGermany,40"
  },
  {
    "path": "datasets/Social_Network_Ads.csv",
    "chars": 10926,
    "preview": "User ID,Gender,Age,EstimatedSalary,Purchased\r\n15624510,Male,19,19000,0\r\n15810944,Male,35,20000,0\r\n15668575,Female,26,430"
  },
  {
    "path": "datasets/readme.md",
    "chars": 30,
    "preview": "Day wise Dataset Used in Code\n"
  },
  {
    "path": "datasets/studentscores.csv",
    "chars": 214,
    "preview": "Hours,Scores\r\n2.5,21\r\n5.1,47\r\n3.2,27\r\n8.5,75\r\n3.5,30\r\n1.5,20\r\n9.2,88\r\n5.5,60\r\n8.3,81\r\n2.7,25\r\n7.7,85\r\n5.9,62\r\n4.5,41\r\n3."
  }
]

About this extraction

This page contains the full source code of the Avik-Jain/100-Days-Of-ML-Code GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 24 files (55.9 KB), approximately 18.8k tokens. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.

Extract another repo