{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"___\n",
"\n",
"<a href='http://www.pieriandata.com'><img src='../Pierian_Data_Logo.png'/></a>\n",
"___\n",
"<center><em>Copyright by Pierian Data Inc.</em></center>\n",
"<center><em>For more information, visit us at <a href='http://www.pieriandata.com'>www.pieriandata.com</a></em></center>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Logistic Regression Project Exercise \n",
"\n",
"**GOAL: Create a classification model that can predict whether or not a person has heart disease based on that person's physical features (age, sex, cholesterol, etc.).**\n",
"\n",
"**Complete the TASKs written in bold below.**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Imports\n",
"\n",
"**TASK: Run the cell below to import the necessary libraries.**"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data\n",
"\n",
"This database contains 14 physical attributes based on physical testing of a patient. Blood samples are taken and the patient also performs a brief exercise test. The \"goal\" field (the `target` column in this file) indicates the presence of heart disease in the patient. It is an integer (0 for no presence, 1 for presence). In general, confirming with certainty whether a patient has heart disease can be quite an invasive process, so a model that accurately predicts the likelihood of heart disease can help avoid expensive and invasive procedures.\n",
"\n",
"Content\n",
"\n",
"Attribute Information:\n",
"\n",
"* age\n",
"* sex\n",
"* chest pain type (4 values)\n",
"* resting blood pressure\n",
"* serum cholesterol in mg/dl\n",
"* fasting blood sugar > 120 mg/dl\n",
"* resting electrocardiographic results (values 0,1,2)\n",
"* maximum heart rate achieved\n",
"* exercise induced angina\n",
"* oldpeak = ST depression induced by exercise relative to rest\n",
"* the slope of the peak exercise ST segment\n",
"* number of major vessels (0-3) colored by fluoroscopy\n",
"* thal: 3 = normal; 6 = fixed defect; 7 = reversible defect\n",
"* target: 0 for no presence of heart disease, 1 for presence of heart disease\n",
"\n",
"Original Source: https://archive.ics.uci.edu/ml/datasets/Heart+Disease\n",
"\n",
"Creators:\n",
"\n",
"Hungarian Institute of Cardiology. Budapest: Andras Janosi, M.D.\n",
"University Hospital, Zurich, Switzerland: William Steinbrunn, M.D.\n",
"University Hospital, Basel, Switzerland: Matthias Pfisterer, M.D.\n",
"V.A. Medical Center, Long Beach and Cleveland Clinic Foundation: Robert Detrano, M.D., Ph.D."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"----\n",
"\n",
"**TASK: Run the cell below to read in the data.**"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"df = pd.read_csv('../DATA/heart.csv')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>age</th>\n",
" <th>sex</th>\n",
" <th>cp</th>\n",
" <th>trestbps</th>\n",
" <th>chol</th>\n",
" <th>fbs</th>\n",
" <th>restecg</th>\n",
" <th>thalach</th>\n",
" <th>exang</th>\n",
" <th>oldpeak</th>\n",
" <th>slope</th>\n",
" <th>ca</th>\n",
" <th>thal</th>\n",
" <th>target</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>63</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>145</td>\n",
" <td>233</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>150</td>\n",
" <td>0</td>\n",
" <td>2.3</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>37</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>130</td>\n",
" <td>250</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>187</td>\n",
" <td>0</td>\n",
" <td>3.5</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>41</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>130</td>\n",
" <td>204</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>172</td>\n",
" <td>0</td>\n",
" <td>1.4</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>56</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>120</td>\n",
" <td>236</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>178</td>\n",
" <td>0</td>\n",
" <td>0.8</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>57</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>120</td>\n",
" <td>354</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>163</td>\n",
" <td>1</td>\n",
" <td>0.6</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" age sex cp trestbps chol fbs restecg thalach exang oldpeak slope \\\n",
"0 63 1 3 145 233 1 0 150 0 2.3 0 \n",
"1 37 1 2 130 250 0 1 187 0 3.5 0 \n",
"2 41 0 1 130 204 0 0 172 0 1.4 2 \n",
"3 56 1 1 120 236 0 1 178 0 0.8 2 \n",
"4 57 0 0 120 354 0 1 163 1 0.6 2 \n",
"\n",
" ca thal target \n",
"0 0 1 1 \n",
"1 0 2 1 \n",
"2 0 2 1 \n",
"3 0 2 1 \n",
"4 0 2 1 "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([1, 0], dtype=int64)"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['target'].unique()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Exploratory Data Analysis and Visualization\n",
"\n",
"Feel free to explore the data further on your own.\n",
"\n",
"**TASK: Check whether the dataset has any missing data points and create a statistical summary of the numerical features as shown below.**"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
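{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Hint: one possible approach, offered as a sketch rather than the official solution. It assumes `df` was loaded by the read-in cell above. `df.isnull().sum()` counts missing values per column, and `df.describe().transpose()` gives a summary with one row per feature, which matches the layout of the table shown below.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Count missing values in each column (all zeros means no missing data)\n",
"print(df.isnull().sum())\n",
"\n",
"# Summary statistics, transposed so that each feature is a row\n",
"df.describe().transpose()"
]
},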
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 303 entries, 0 to 302\n",
"Data columns (total 14 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 age 303 non-null int64 \n",
" 1 sex 303 non-null int64 \n",
" 2 cp 303 non-null int64 \n",
" 3 trestbps 303 non-null int64 \n",
" 4 chol 303 non-null int64 \n",
" 5 fbs 303 non-null int64 \n",
" 6 restecg 303 non-null int64 \n",
" 7 thalach 303 non-null int64 \n",
" 8 exang 303 non-null int64 \n",
" 9 oldpeak 303 non-null float64\n",
" 10 slope 303 non-null int64 \n",
" 11 ca 303 non-null int64 \n",
" 12 thal 303 non-null int64 \n",
" 13 target 303 non-null int64 \n",
"dtypes: float64(1), int64(13)\n",
"memory usage: 33.3 KB\n"
]
}
],
"source": []
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>count</th>\n",
" <th>mean</th>\n",
" <th>std</th>\n",
" <th>min</th>\n",
" <th>25%</th>\n",
" <th>50%</th>\n",
" <th>75%</th>\n",
" <th>max</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>age</th>\n",
" <td>303.0</td>\n",
" <td>54.366337</td>\n",
" <td>9.082101</td>\n",
" <td>29.0</td>\n",
" <td>47.5</td>\n",
" <td>55.0</td>\n",
" <td>61.0</td>\n",
" <td>77.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>sex</th>\n",
" <td>303.0</td>\n",
" <td>0.683168</td>\n",
" <td>0.466011</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>cp</th>\n",
" <td>303.0</td>\n",
" <td>0.966997</td>\n",
" <td>1.032052</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>1.0</td>\n",
" <td>2.0</td>\n",
" <td>3.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>trestbps</th>\n",
" <td>303.0</td>\n",
" <td>131.623762</td>\n",
" <td>17.538143</td>\n",
" <td>94.0</td>\n",
" <td>120.0</td>\n",
" <td>130.0</td>\n",
" <td>140.0</td>\n",
" <td>200.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>chol</th>\n",
" <td>303.0</td>\n",
" <td>246.264026</td>\n",
" <td>51.830751</td>\n",
" <td>126.0</td>\n",
" <td>211.0</td>\n",
" <td>240.0</td>\n",
" <td>274.5</td>\n",
" <td>564.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>fbs</th>\n",
" <td>303.0</td>\n",
" <td>0.148515</td>\n",
" <td>0.356198</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>1.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>restecg</th>\n",
" <td>303.0</td>\n",
" <td>0.528053</td>\n",
" <td>0.525860</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" <td>2.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>thalach</th>\n",
" <td>303.0</td>\n",
" <td>149.646865</td>\n",
" <td>22.905161</td>\n",
" <td>71.0</td>\n",
" <td>133.5</td>\n",
" <td>153.0</td>\n",
" <td>166.0</td>\n",
" <td>202.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>exang</th>\n",
" <td>303.0</td>\n",
" <td>0.326733</td>\n",
" <td>0.469794</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>oldpeak</th>\n",
" <td>303.0</td>\n",
" <td>1.039604</td>\n",
" <td>1.161075</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.8</td>\n",
" <td>1.6</td>\n",
" <td>6.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>slope</th>\n",
" <td>303.0</td>\n",
" <td>1.399340</td>\n",
" <td>0.616226</td>\n",
" <td>0.0</td>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" <td>2.0</td>\n",
" <td>2.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>ca</th>\n",
" <td>303.0</td>\n",
" <td>0.729373</td>\n",
" <td>1.022606</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>1.0</td>\n",
" <td>4.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>thal</th>\n",
" <td>303.0</td>\n",
" <td>2.313531</td>\n",
" <td>0.612277</td>\n",
" <td>0.0</td>\n",
" <td>2.0</td>\n",
" <td>2.0</td>\n",
" <td>3.0</td>\n",
" <td>3.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>target</th>\n",
" <td>303.0</td>\n",
" <td>0.544554</td>\n",
" <td>0.498835</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" count mean std min 25% 50% 75% max\n",
"age 303.0 54.366337 9.082101 29.0 47.5 55.0 61.0 77.0\n",
"sex 303.0 0.683168 0.466011 0.0 0.0 1.0 1.0 1.0\n",
"cp 303.0 0.966997 1.032052 0.0 0.0 1.0 2.0 3.0\n",
"trestbps 303.0 131.623762 17.538143 94.0 120.0 130.0 140.0 200.0\n",
"chol 303.0 246.264026 51.830751 126.0 211.0 240.0 274.5 564.0\n",
"fbs 303.0 0.148515 0.356198 0.0 0.0 0.0 0.0 1.0\n",
"restecg 303.0 0.528053 0.525860 0.0 0.0 1.0 1.0 2.0\n",
"thalach 303.0 149.646865 22.905161 71.0 133.5 153.0 166.0 202.0\n",
"exang 303.0 0.326733 0.469794 0.0 0.0 0.0 1.0 1.0\n",
"oldpeak 303.0 1.039604 1.161075 0.0 0.0 0.8 1.6 6.2\n",
"slope 303.0 1.399340 0.616226 0.0 1.0 1.0 2.0 2.0\n",
"ca 303.0 0.729373 1.022606 0.0 0.0 0.0 1.0 4.0\n",
"thal 303.0 2.313531 0.612277 0.0 2.0 2.0 3.0 3.0\n",
"target 303.0 0.544554 0.498835 0.0 0.0 1.0 1.0 1.0"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Visualization Tasks\n",
"\n",
"**TASK: Create a bar plot that shows the total counts per target value.**"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE!"
]
},
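{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Hint: a minimal sketch of one way to do this, assuming seaborn's `countplot` is an acceptable form of bar plot for this task. It draws one bar per value of `target`, with the bar height equal to the number of rows holding that value.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# One bar per class; height is the number of patients in that class\n",
"sns.countplot(x='target', data=df)"
]
},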
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='target', ylabel='count'>"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Create a pairplot that displays the relationships between the following columns:**\n",
"\n",
" ['age','trestbps', 'chol','thalach','target']\n",
" \n",
"*Note: Running a pairplot on every column can take a very long time due to the number of features.*"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
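{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Hint: one possible sketch. Selecting the five listed columns first keeps the plot fast. Coloring the points with `hue='target'` is an assumption on our part, chosen because it makes the two classes easy to compare; the task itself does not require it.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Restrict the pairplot to the requested columns\n",
"cols = ['age', 'trestbps', 'chol', 'thalach', 'target']\n",
"\n",
"# hue='target' (an assumed choice) colors each point by class\n",
"sns.pairplot(df[cols], hue='target')"
]
},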
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.PairGrid at 0x2573c4e2148>"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 762.375x720 with 20 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Create a heatmap that displays the correlation between all the columns.**"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
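{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Hint: a sketch under a few assumptions. `df.corr()` computes the pairwise correlation between all columns, and `sns.heatmap` draws it. The figure size, the `viridis` colormap, and `annot=True` are illustrative choices, not requirements of the task.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A larger figure keeps the 14x14 grid readable (size is an assumed choice)\n",
"plt.figure(figsize=(12, 8))\n",
"\n",
"# annot=True writes each correlation value inside its cell\n",
"sns.heatmap(df.corr(), annot=True, cmap='viridis')"
]
},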
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:>"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 864x576 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"----\n",
"----\n",
"\n",
"# Machine Learning\n",
"\n",
"## Train | Test Split and Scaling\n",
"\n",
"**TASK: Separate the features from the labels into 2 objects, X and y.**"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
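{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Hint: one possible way to do this, assuming the label is the `target` column and every other column is a feature.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Features: every column except the label\n",
"X = df.drop('target', axis=1)\n",
"\n",
"# Label: 0 = no heart disease, 1 = heart disease\n",
"y = df['target']"
]
},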
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Perform a train test split on the data, with a test size of 10% and a random_state of 101.**"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
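{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Hint: a minimal sketch, assuming `X` and `y` already hold the features and labels from the previous task. The variable names on the left-hand side are illustrative.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split\n",
"\n",
"# test_size=0.1 holds out 10% of the rows; random_state makes the split repeatable\n",
"X_train, X_test, y_train, y_test = train_test_split(\n",
"    X, y, test_size=0.1, random_state=101)"
]
},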
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Create a StandardScaler object and scale the X train and test set feature data. Make sure you fit the scaler only on the training data to avoid data leakage (information from the test set leaking into training).**"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
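{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Hint: one possible sketch, assuming `X_train` and `X_test` exist from the split above. The names `scaled_X_train` and `scaled_X_test` are illustrative. The scaler learns the mean and standard deviation from the training set only and then applies those same statistics to the test set.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.preprocessing import StandardScaler\n",
"\n",
"scaler = StandardScaler()\n",
"\n",
"# Fit on the training data only, then transform it\n",
"scaled_X_train = scaler.fit_transform(X_train)\n",
"\n",
"# Reuse the training statistics on the test data (no refitting)\n",
"scaled_X_test = scaler.transform(X_test)"
]
},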
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Logistic Regression Model\n",
"\n",
"**TASK: Create a Logistic Regression model and use cross-validation to search for a well-performing value of the C hyperparameter. You have two options here: use *LogisticRegressionCV*, or use a combination of *LogisticRegression* and *GridSearchCV*. The choice is up to you.**"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
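{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Hint: a sketch of the first option, using `LogisticRegressionCV` with its default settings. It assumes `scaled_X_train` and `y_train` exist from the earlier tasks, and the name `log_model` is illustrative. With defaults, the estimator tries a grid of C values with cross-validation and keeps the best one; your own search settings may reasonably differ.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.linear_model import LogisticRegressionCV\n",
"\n",
"# Default settings: a built-in grid of C values, scored by cross-validation\n",
"log_model = LogisticRegressionCV()\n",
"\n",
"# Fit on the scaled training data only\n",
"log_model.fit(scaled_X_train, y_train)"
]
},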
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"# help(LogisticRegressionCV)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"LogisticRegressionCV()"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Report back your search's optimal parameters, specifically the C value.** \n",
"\n",
"*Note: You may get a different value than what is shown here depending on how you conducted your search.*"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
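{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Hint: assuming a fitted `LogisticRegressionCV` stored in a variable named `log_model` (an illustrative name), the selected C is available on the `C_` attribute. If you used `GridSearchCV` instead, look at its `best_params_` attribute.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# C value chosen by cross-validation (returned as an array)\n",
"log_model.C_"
]
},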
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Coefficients\n",
"\n",
"**TASK: Report back the model's coefficients.**"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.09621199, -0.39460154, 0.53534731, -0.13850191, -0.08830462,\n",
" 0.02487341, 0.08083826, 0.29914053, -0.33438151, -0.352386 ,\n",
" 0.25101033, -0.49735752, -0.37448551]])"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**BONUS TASK: We didn't show this in the lecture notebooks, but you have the skills to do this! Create a visualization of the coefficients by using a barplot of their values. Even more bonus points if you can figure out how to sort the plot! If you get stuck on this, feel free to quickly view the solutions notebook for hints. There are many ways to do this; the solutions use a combination of pandas and seaborn.**"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
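{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Hint: one of many possible approaches, sketched with pandas and seaborn. It assumes a fitted binary model named `log_model` and a feature DataFrame named `X` (both illustrative names). The coefficients are paired with their feature names in a Series, sorted, and then plotted.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# coef_ has shape (1, n_features) for a binary model, so take row 0\n",
"coefs = pd.Series(index=X.columns, data=log_model.coef_[0])\n",
"\n",
"# Sort from most negative to most positive\n",
"coefs = coefs.sort_values()\n",
"\n",
"plt.figure(figsize=(10, 6))\n",
"sns.barplot(x=coefs.index, y=coefs.values)"
]
},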
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAlsAAAFlCAYAAADcXS0xAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy86wFpkAAAACXBIWXMAAAsTAAALEwEAmpwYAAAbBElEQVR4nO3de5xudV0v8M9XNl5RkdjiFsRNhiWVWW5Qk2OQl5diHuCoCYnBKeNgGZl1DNM8njqapXk65QXBeIH3VEQRSDAEb3jhonKREA5qIqRoVmJ2FP2dP9Ya9+Mwe2b2nuc3z57N+/167desZ63frN931u35PL+1Zna11gIAQB93mHUBAAA7MmELAKAjYQsAoCNhCwCgI2ELAKAjYQsAoKN1sy5gMbvvvnvbuHHjrMsAAFjSpZde+rXW2vr587frsLVx48Zccsklsy4DAGBJVfXFhea7jQgA0JGwBQDQkbAFANCRsAUA0JGwBQDQkbAFANCRsAUA0JGwBQDQkbAFANCRsAUA0JGwBQDQkbAFANDRdv0fUQMATPrKX31k1iUkSfY4/sBltzWyBQDQkbAFANCRsAUA0JGwBQDQkbAFANCRsAUA0JGwBQDQkbAFANCRsAUA0JGwBQDQkbAFANCRsAUA0JGwBQDQkbAFANCRsAUA0JGwBQDQkbAFANCRsAUA0JGwBQDQkbAFANCRsAUA0JGwBQDQkbAFANCRsAUA0JGwBQDQkbAFANCRsAUA0JGwBQDQkbAFANCRsAUA0JGwBQDQkbAFANCRsAUA0JGwBQDQkbAFANDRVMJWVT2+qq6pquuq6oRF2u1fVd+rqqdMo18AgO3disNWVe2U5NVJnpBkvyRHVtV+W2j3Z0nOXWmfAABrxTRGtg5Icl1r7frW2neSvC3JoQu0++0kpyf56hT6BABYE6YRtvZM8qWJ1zeM836gqvZMcniSE5daWVUdW1WXVNUlN9988xTKAwCYnWmErVpgXpv3+i+T/EFr7XtLray1dlJrbVNrbdP69eunUB4AwOysm8I6bkhyv4nXeyW5cV6bTUneVlVJsnuSQ6rq1tbau6fQPwDAdmsaYeviJPtW1T5JvpzkiCS/MtmgtbbP3HRVnZrkLEELALg9WHHYaq3dWlXPzvBbhjslOaW1dlVVHTcuX/I5LQCAHdU0RrbSWjsnyTnz5i0Yslprx0yjTwCAtcBfkAcA6EjYAgDoSNgCAOhI2AIA6EjYAgDoSNgCAOhI2AIA6EjYAgDoSNgCAOhI2AIA6EjYAgDoSNgCAOhI2AIA6EjYAgDoSNgCAOhI2AIA6EjYAgDoSNgCAOhI2AIA6EjYAgDoSNgCAOhI2AIA6EjYAgDoaN2sCwAAZuumP79p1iUkSTY8b8OsS+jCyBYAQEfCFgBAR8IWAEBHwhYAQEfCFgBAR8IWAEBHwhYAQEfCFgBAR8IWAEBHwhYAQEfCFgBAR8IWAEBHwhYAQEfrZl0AAOyILnzTzbMuIUly0FHrZ13C7Z6RLQCAjoQtAICOhC0AgI6ELQCAjoQtAICOhC0AgI6ELQCAjoQtAICOhC0AgI6ELQCAjqYStqrq8VV1TVVdV1UnLLD86VV1+fjvoqr6mWn0CwCwvVtx2KqqnZK8OskTkuyX5Miq2m9es88n+YXW2oOT/EmSk1baLwDAWjCNka0DklzXWru+tfadJG9Lcuhkg9baRa21b4wvP55kryn0CwCw3ZtG2NozyZcmXt8wztuSX0/yd1PoFwBgu7duCuuoBea1BRtWHZwhbB24xZVVHZvk2CTZe++9p1AeAMDsTGNk64Yk95t4vVeSG+c3qqoHJ3l9kkNba1/f0spaaye11ja11jatX79+CuUBAMzONMLWxUn2rap9quqOSY5IcuZkg6raO8m7kjyjtfa5KfQJALAmrPg2Ymvt1qp6dpJzk+yU5JTW2lVVddy4/MQkL0ryI0leU1VJcmtrbdNK+wYA2N5N45mttNbOSX
LOvHknTkw/M8kzp9EXAMBa4i/IAwB0JGwBAHQkbAEAdCRsAQB0JGwBAHQkbAEAdCRsAQB0JGwBAHQkbAEAdCRsAQB0JGwBAHQkbAEAdCRsAQB0JGwBAHQkbAEAdCRsAQB0JGwBAHQkbAEAdCRsAQB0JGwBAHQkbAEAdCRsAQB0JGwBAHQkbAEAdCRsAQB0JGwBAHQkbAEAdCRsAQB0JGwBAHS0btYFAMByvfKMf5p1CT/w3MPvM+sSWCOMbAEAdCRsAQB0JGwBAHQkbAEAdCRsAQB0JGwBAHQkbAEAdCRsAQB0JGwBAHQkbAEAdCRsAQB0JGwBAHQkbAEAdCRsAQB0JGwBAHS0btYFADBbTz79k7Mu4QdOf/IBsy4Bps7IFgBAR8IWAEBHwhYAQEdTCVtV9fiquqaqrquqExZYXlX1V+Pyy6vq56bRLwDA9m7FYauqdkry6iRPSLJfkiOrar95zZ6QZN/x37FJXrvSfgEA1oJpjGwdkOS61tr1rbXvJHlbkkPntTk0yRva4ONJdq2qDVPoGwBguzaNP/2wZ5IvTby+IcnDltFmzyQ3zV9ZVR2bYfQre++99w8tu/m1b1p5tVOw/llHLdnmhlf92ipUsrS9nn3Kkm0ueP0TV6GSpR38zLOXbHPqaY9bhUqWdszR5y3Z5gXvePwqVLK0lzz1fYsuP+Tdf7hKlSztnMNeuujyJ55+8ipVsrSzn/wbiy5/0jvPWKVKlvbepxy+6PK19OcWnnv4fWZdwrIddNT6WZewbBuet3bGP/Y4/sBZl7DVpjGyVQvMa9vQZpjZ2kmttU2ttU3r16+dAxUAYCHTCFs3JLnfxOu9kty4DW0AAHY40whbFyfZt6r2qao7JjkiyZnz2pyZ5FfH30p8eJJ/ba3d5hYiAMCOZsXPbLXWbq2qZyc5N8lOSU5prV1VVceNy09Mck6SQ5Jcl+Tfk/zXlfYLALAWTOX/RmytnZMhUE3OO3FiuiX5rWn0BQCwlvgL8gAAHQlbAAAdCVsAAB0JWwAAHQlbAAAdCVsAAB0JWwAAHQlbAAAdCVsAAB0JWwAAHQlbAAAdCVsAAB0JWwAAHQlbAAAdCVsAAB0JWwAAHQlbAAAdCVsAAB0JWwAAHQlbAAAdrZt1AQA7ovc+5fBZlwBsJ4xsAQB0ZGQLWDPOfvJvzLoEgK1mZAsAoCNhCwCgI2ELAKAjYQsAoCNhCwCgI2ELAKAjYQsAoCNhCwCgI2ELAKAjYQsAoCNhCwCgI/83ItzOnXPYS2ddAsAOzcgWAEBHwhYAQEfCFgBAR8IWAEBHwhYAQEfCFgBAR8IWAEBHwhYAQEfCFgBAR8IWAEBHwhYAQEfCFgBAR8IWAEBHKwpbVbVbVb2/qq4dv95rgTb3q6oLqurqqrqqqn5nJX0CAKwlKx3ZOiHJ+a21fZOcP76e79Ykv9dae1CShyf5rarab4X9AgCsCSsNW4cmOW2cPi3JYfMbtNZuaq1dNk5/M8nVSfZcYb8AAGvCSsPWHq21m5IhVCW592KNq2pjkp9N8okV9gsAsCasW6pBVf19kvsssOgFW9NRVe2S5PQkz2mt/dsi7Y5NcmyS7L333lvTBQDAdmfJsNVae8yWllXVV6pqQ2vtpqrakOSrW2i3c4ag9ebW2ruW6O+kJCclyaZNm9pS9QEAbM9WehvxzCRHj9NHJ3nP/AZVVUn+JsnVrbVXrrA/AIA1ZaVh62VJHltV1yZ57Pg6VXXfqjpnbPPIJM9I8otV9enx3yEr7BcAYE1Y8jbiYlprX0/y6AXm35jkkHH6I0lqJf0AAKxV/oI8AEBHwhYAQEfCFgBAR8IWAEBHwhYAQEfCFgBAR8IWAEBHwhYAQEfCFgBAR8IWAEBHwhYAQEfCFgBAR8IWAEBHwhYAQEfCFgBAR8IWAEBHwhYAQEfCFgBAR8IWAEBHwhYAQEfCFgBAR8IWAEBHwhYAQEfCFgBAR8IWAEBHwhYAQEfCFgBAR8IWAEBHwhYAQE
fCFgBAR8IWAEBHwhYAQEfCFgBAR+tmXQDsqF7y1PfNugQAtgNGtgAAOhK2AAA6ErYAADoStgAAOvKAPGvKMUefN+s
"text/plain": [
"<Figure size 720x432 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---------\n",
"\n",
"## Model Performance Evaluation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Let's now evaluate your model on the remaining 10% of the data, the test set.**\n",
"\n",
"**TASK: Create the following evaluations:**\n",
"* Confusion Matrix Array\n",
"* Confusion Matrix Plot\n",
"* Classification Report"
]
},
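{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Hint (a minimal sketch, not the official solution):* the three evaluations listed above map onto `confusion_matrix`, `ConfusionMatrixDisplay` and `classification_report` from `sklearn.metrics`. The example below runs on synthetic stand-in data so that it is self-contained. The names `X`, `y`, `scaler`, `model`, `scaled_X_test` and `y_pred` are assumptions made for illustration, as is the choice of 303 rows, which was picked only so that a 10% split gives 31 test rows like the expected report. `ConfusionMatrixDisplay.from_estimator` requires scikit-learn 1.0 or newer; older releases offered `plot_confusion_matrix` for the same purpose.\n",
"\n",
"```python\n",
"from sklearn.datasets import make_classification\n",
"from sklearn.linear_model import LogisticRegression\n",
"from sklearn.metrics import (ConfusionMatrixDisplay, classification_report,\n",
"                             confusion_matrix)\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.preprocessing import StandardScaler\n",
"\n",
"# Synthetic stand-in for the heart data (assumption): 13 features, binary target\n",
"X, y = make_classification(n_samples=303, n_features=13, random_state=101)\n",
"X_train, X_test, y_train, y_test = train_test_split(\n",
"    X, y, test_size=0.1, random_state=101)\n",
"\n",
"# Fit the scaler on the training split only, then reuse it on the test split\n",
"scaler = StandardScaler()\n",
"scaled_X_train = scaler.fit_transform(X_train)\n",
"scaled_X_test = scaler.transform(X_test)\n",
"\n",
"model = LogisticRegression().fit(scaled_X_train, y_train)\n",
"y_pred = model.predict(scaled_X_test)\n",
"\n",
"# Confusion matrix as an array: rows are true classes, columns are predictions\n",
"print(confusion_matrix(y_test, y_pred))\n",
"\n",
"# Confusion matrix as a plot\n",
"ConfusionMatrixDisplay.from_estimator(model, scaled_X_test, y_test)\n",
"\n",
"# Precision, recall, f1-score and support per class\n",
"print(classification_report(y_test, y_pred))\n",
"```\n",
"\n",
"In your own notebook, replace the synthetic data and model with the test split and the fitted model you created earlier. The numbers will differ from the synthetic ones."
]
},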
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[12, 3],\n",
" [ 2, 14]], dtype=int64)"
]
},
"execution_count": 56,
"metadata": {},
"output_type": "execute_result"
}
],
"source": []
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<sklearn.metrics._plot.confusion_matrix.ConfusionMatrixDisplay at 0x2573dba6e08>"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAATIAAAEKCAYAAACR79kFAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy86wFpkAAAACXBIWXMAAAsTAAALEwEAmpwYAAAVz0lEQVR4nO3deZxeVX3H8c83M0mGEAKEBCQJS6ABGkAojpGl7ChBUdBiBUHRYhGK4AulFqsCxVprldYVZFgKVogSDKKiLGULWBSSsCUBSkQkYUtCQEgIJjPz6x/3TplMJvPc++RZ7p35vnndV+7yPOf+JvPKj3POPedcRQRmZmU2rNkBmJltLCcyMys9JzIzKz0nMjMrPScyMys9JzIzKz0nMjNrGklXSloqaX4/186RFJLGVSrHiczMmukqYHrfk5K2A94JPJOlECcyM2uaiJgNrOjn0n8AnwMyjdhvrWVQG2v0liNi7MS2ZodhOfzxyU2aHYLlsLrzVdZ0rdbGlHHkoZvGSyu6Mn127iN/WgC80etUR0R0DPQdSe8Dno2Ih6VsoRYqkY2d2MZnZ05rdhiWw83T92x2CJbD/7xw7UaX8dKKLu6/ZftMn23Z9sk3IqI9a9mSRgFfAN6VJ6ZCJTIzK74AuumuV/E7A5OBntrYJGCepGkR8cKGvuREZma5BMHayNa0zF12xKPA1j3Hkp4G2iNi+UDfc2e/meXWnfG/SiTNAO4DdpW0RNIp1cTjGpmZ5RIEXTVa/isiTqhwfccs5TiRmVlu3dlGRTSME5mZ5RJAlxOZmZWda2RmVmoBrC3YEvlOZGaWSxBuWppZyQV0FSuPOZGZWT7JyP5icSIzs5xEFxs177zmnMjMLJeks9+JzMxKLBlH5kRmZiXX7RqZmZWZa2RmVnqB6CrYwjlOZGaWm5uWZlZqgVgTLc0OYx1OZGaWSzIg1k1LMys5d/abWalFiK5wjczMSq7bNTIzK7Oks79YqaNY0ZhZ4bmz38wGhS6PIzOzMvPIfjMbFLr91NLMyiyZNO5EZmYlFoi1nqJkZmUWQeEGxBYrGjMrAdGdcatYknSlpKWS5vc693VJj0t6RNINkraoVI4TmZnlEiQ1sixbBlcB0/ucuw3YIyLeCvwv8PlKhTiRmVluXQzLtFUSEbOBFX3O3RoRnenhb4BJlcpxH5mZ5RKokQsr/g3w40ofciIzs1yS18FlTh3jJM3pddwRER1ZvijpC0AncE2lzzqRmVlOuV7Quzwi2nPfQToZOBo4PCKi0uedyMwsl6C+I/slTQf+ATg4Il7P8h0nMjPLrVYrxEqaARxC0gRdApxP8pRyJHCbJIDfRMRpA5XjRGZmuUSoZjWyiDihn9NX5C3HiczMckk6+z1FycxKzWv2m1nJJZ39XljRzErOy/iYWak1eGR/Jk5kZpabXz5iZqUWAWu7ncjMrMSSpqUTmZmVXK1G9teKE1mNPfzFUbx493BGjg0OvvFVABZ+YxNevGs4w4YHo7brZu9/fp3hYyrOg7UGGz6ii699/z6Gj+impSX49R3bcs1luzQ7rMIp4vCLutYPJU2X9ISkRZLOree9imLSsWt4x6Ur1zk3fr+1HPzTVzn4htcYvUM3iy5ra1J0NpC1a4bxj2fsy5knHcSZJx3I2/Zdxq57vNzssAooaVpm2RqlbneS1AJ8DzgKmAqcIGlqve5XFFu1dzJ883VrW+MP6GRYWvfdYq9OVr9YrP+bWQ/xxurkF9XaGrS0difVD1tPrdbsr5V6Ni2nAYsi4ikAST8CjgEW1vGehbd41ggmHLW22WHYBgwbFnzr6nvZdtIqbrp+B55YsGWzQyqc5KllseZa1rPuNxFY3Ot4SXpuHZJOlTRH0pyVKwb3P/AnL21DrTDx6DXNDsU2oLtbnPmRAzn5vYezy+6vsMNOrzU7pMLpGRCbZWuUeiay/n6K9SrqEdEREe0R0T567PA6htNci386ghfvHs4+X1
uF3LIsvFUrh/PI3K14235Lmx1KIRWtaVnPRLYE2K7X8STguTrer7CW3tPK765o4+3fXUnLJs2OxjZkzBZ/YtPRSatgxMgu9p62nMVPj25yVMXT89SySDWyevaRPQBMkTQZeBY4HvhwHe9XCPPO2ZSXHmhlzSvivw/bnF3OWM2iy9roXit++4nkH8UWe3Xx1vMzreBrDTR23J/4zHkPM2xYoGHBvbdP4IFfb9PssAppyAyIjYhOSZ8CbgFagCsjYkG97lcU+3xj1Xrntv8r94mVwdOLxnDWRw9sdhiFFyE6h0oiA4iIXwK/rOc9zKzxijYg1iP7zSyXIo7sdyIzs9ycyMys1LywopkNCo0cI5aFE5mZ5RIBnV5Y0czKzk1LMys195GZ2aAQTmRmVnZF6+wvVo+dmRVeRO0mjUu6UtJSSfN7nRsr6TZJT6Z/VlwUzonMzHISXd3DMm0ZXAVM73PuXOD2iJgC3J4eD8iJzMxyi1CmrXI5MRtY0ef0McDV6f7VwLGVynEfmZnlknOu5ThJc3odd0RER4XvbBMRzwNExPOStq50EycyM8snkn6yjJZHRHsdowHctDSzKtR5qesXJW0LkP5Zcb1xJzIzyyVq29nfn58BJ6f7JwM3VvqCE5mZ5RaRbatE0gzgPmBXSUsknQL8K/BOSU8C70yPB+Q+MjPLrVYj+yPihA1cOjxPOU5kZpZLUtsq1sh+JzIzy82Txs2s9HIMv2gIJzIzyyUQ3V5Y0czKrmAVMicyM8vJnf1mNigUrErmRGZmuZWmRibpOwyQdyPirLpEZGaFFkB3d0kSGTBngGtmNlQFUJYaWURc3ftY0qYRsar+IZlZ0RVtHFnFwSCS9pO0EHgsPd5L0sV1j8zMiisybg2SZVTbN4EjgZcAIuJh4KA6xmRmhZZtmetGPhDI9NQyIhZL6wTVVZ9wzKwUCta0zJLIFkvaHwhJI4CzSJuZZjYEBUTBnlpmaVqeBpwBTASeBfZOj81syFLGrTEq1sgiYjlwYgNiMbOyKFjTMstTy50k/VzSsvSNwDdK2qkRwZlZQZXwqeW1wHXAtsAEYCYwo55BmVmB9QyIzbI1SJZEpoj4r4joTLcfUriKpZk1Uq1ePlIrA821HJvu3inpXOBHJAnsQ8BNDYjNzIqqYE8tB+rsn0uSuHoi/mSvawF8uV5BmVmxqWBtsoHmWk5uZCBmVhIN7sjPItPIfkl7AFOBtp5zEfGDegVlZkXW2I78LComMknnA4eQJLJfAkcB9wJOZGZDVcFqZFmeWh5H8tbfFyLi48BewMi6RmVmxdadcWuQLE3L1RHRLalT0hhgKeABsWZDVQEXVsxSI5sjaQvgMpInmfOA++sZlJkVmyLbVrEc6WxJCyTNlzRDUlvlb60vy1zLv0t3vy/pZmBMRDxSzc3MbJCoQR+ZpIkkq+lMjYjVkq4DjgeuylvWQANi9xnoWkTMy3szM7M+WoFNJK0FRgHPVVvIhlw0wLUADqvmhgN5ZUErv9h9y1oXa3V0y3O/aHYIlsO0I/9Yk3JyDIgdJ6n3i4w6IqIDICKelfQN4BlgNXBrRNxaTTwDDYg9tJoCzWyQC/JMUVoeEe39XZC0JXAMMBl4BZgp6aR0PncuWTr7zczWVZtlfI4Afh8RyyJiLTAL2L+acPymcTPLrUZzLZ8B9pU0iqRpeThVvk/XNTIzy68GNbKI+C1wPcmQrkdJ8lFHNeFkmaIkkqWud4qICyVtD7wlIjyWzGyoqtEUpYg4Hzh/Y8vJUiO7GNgPOCE9fg343sbe2MzKKetg2EYu9ZOlj+wdEbGPpAcBIuLl9LVwZjZUlWhhxR5rJbWQViYljaeh00HNrGiKtrBilqblt4EbgK0lfYVkCZ9/qWtUZlZsBXuLUpa5ltdImkvyaFTAsRHhN42bDVUN7v/KIstTy+2B14Gf9z4XEc/UMzAzK7CyJTKSNyb1vISkjWQ6wRPA7nWMy8wKTAXrJc/StNyz93G6Ks
YnN/BxM7OGyz1FKSLmSXp7PYIxs5IoW9NS0md6HQ4D9gGW1S0iMyu2Mnb2A5v12u8k6TP7SX3CMbNSKFMiSwfCjo6
"text/plain": [
"<Figure size 432x288 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": []
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" precision recall f1-score support\n",
"\n",
" 0 0.86 0.80 0.83 15\n",
" 1 0.82 0.88 0.85 16\n",
"\n",
" accuracy 0.84 31\n",
" macro avg 0.84 0.84 0.84 31\n",
"weighted avg 0.84 0.84 0.84 31\n",
"\n"
]
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Performance Curves\n",
"\n",
"**TASK: Create both the precision-recall curve and the ROC curve.**"
]
},
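{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Hint (a minimal sketch under stated assumptions):* both curves can be drawn straight from a fitted classifier and the held-out data. The example uses synthetic stand-in data, and the names `X`, `y` and `model` are hypothetical. `PrecisionRecallDisplay.from_estimator` and `RocCurveDisplay.from_estimator` require scikit-learn 1.0 or newer; older releases offered `plot_precision_recall_curve` and `plot_roc_curve` instead.\n",
"\n",
"```python\n",
"from sklearn.datasets import make_classification\n",
"from sklearn.linear_model import LogisticRegression\n",
"from sklearn.metrics import PrecisionRecallDisplay, RocCurveDisplay\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"# Synthetic stand-in data (assumption), split 90/10 like the exercise\n",
"X, y = make_classification(n_samples=303, n_features=13, random_state=101)\n",
"X_train, X_test, y_train, y_test = train_test_split(\n",
"    X, y, test_size=0.1, random_state=101)\n",
"model = LogisticRegression(max_iter=1000).fit(X_train, y_train)\n",
"\n",
"# Precision against recall as the decision threshold varies\n",
"PrecisionRecallDisplay.from_estimator(model, X_test, y_test)\n",
"\n",
"# True positive rate against false positive rate as the threshold varies\n",
"RocCurveDisplay.from_estimator(model, X_test, y_test)\n",
"```\n",
"\n",
"With a test set this small, expect both curves to look like step functions rather than smooth lines."
]
},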
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<sklearn.metrics._plot.precision_recall_curve.PrecisionRecallDisplay at 0x2573dc46cc8>"
]
},
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYgAAAEGCAYAAAB/+QKOAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy86wFpkAAAACXBIWXMAAAsTAAALEwEAmpwYAAAe8UlEQVR4nO3de3QV9bn/8fdDAoLcZBVEJBpAPCLXiFGhyrUVwUsp2iqIBVGq9GDLUnGBLitydP2kR9tSF5yiFapWJWgrFy2iFVGsRUPQcAtSI0QMF4nckSoGnt8fexM3yYRsYE92Lp/XWntlz8x3Js83l/3Z35k9M+buiIiIlFYn2QWIiEjVpIAQEZFACggREQmkgBARkUAKCBERCZSa7AISqXnz5t6mTZtklyEiUm2sWLHiS3dvEbSsRgVEmzZtyMnJSXYZIiLVhpl9Vt4y7WISEZFACggREQmkgBARkUAKCBERCaSAEBGRQKEFhJnNMrPtZramnOVmZo+bWb6ZrTKz7jHLBprZ+uiyiWHVKCIi5QtzBPE0MPAYywcB50YftwF/BDCzFGB6dHlHYJiZdQyxThERCRDaeRDuvtTM2hyjyWDgWY9cb/x9MzvNzFoBbYB8d98AYGZZ0bZ5YdU6+ZW15G3ZG9bmReQ4Dc5ozY2XnJ3sMmq9ZB6DaA18HjNdGJ1X3vxAZnabmeWYWU5RUVEohYpI5cnbupf5uZuTXYaQ3DOpLWCeH2N+IHd/EngSIDMz84TufjTpmk4nspqIhOCGJ5YluwSJSmZAFAJnxUynAVuAeuXMFxGRSpTMXUwLgBHRTzP1APa4+1ZgOXCumbU1s3rA0GhbERGpRKGNIMxsNtAXaG5mhcAkoC6Au88AFgJXAvnAAWBUdFmxmd0BvA6kALPcfW1YdYqISLAwP8U0rILlDowtZ9lCIgEiIiJJojOpRUQkkAJCREQCKSBERCSQAkJERAIpIEREJJACQkREAikgREQkkAJCREQCKSBERCSQAkJERAIpIEREJJACQkREAikgREQkkAJCREQCKSBERCSQAkJERAIpIEREJJACQkREAikgREQkkAJCREQCKSBERCSQAkJERAIpIEREJJACQkREAikgREQkkAJCREQCKSBERCSQAkJERAIpIEREJJACQkREAikgREQkkAJCREQCKSBERCSQAkJERAIpIEREJJACQkREAoUaEGY20MzWm1m+mU0MWN7MzOaa2SozyzazzjHLCsxstZnlmllOmHWKiEhZqWFt2MxSgOnA5UAhsNzMFrh7Xkyz+4Bcdx9iZh2i7X8Qs7yfu38ZVo0iIlK+MEcQFwP57r7B3Q8CWcDgUm06AosB3P1joI2ZtQyxJhERiVOYAdEa+DxmujA6L9ZK4FoAM7sYSAfSossceMPMVpjZbeV9EzO7zcxyzCynqKgoYcWLiNR2YQaEBczzUtNTgGZmlgv8EvgIKI4uu9TduwODgLFm1jvom7j7k+6e6e6ZLVq0SEzlIiIS3jEIIiOGs2Km04AtsQ3cfS8wCsDMDNgYfeDuW6Jft5vZXCK7rJaGWK+IiMQIcwSxHDjXzNqaWT1gKLAgtoGZnRZdBjAaWOrue82soZk1jrZpCAwA1oRYq4iIlBLaCMLdi83sDuB1IAWY5e5rzWxMdPkM4HzgWTM7BOQBt0ZXbwnMjQwqSAVecPdFYdUqIiJlhbmLCXdfCCwsNW9GzPNlwLkB620AuoVZm4iIHJvOpBYRkUAKCBERCaSAEBGRQAoIEREJpIAQEZFACggREQmkgBARkUAKCBERCRTqiXIiIsn0wgebmJ+7OWHbG5zRmhsvOTth26vqNIIQkRprfu5m8rbuTci28rbuTWjYVAcaQYhIjdaxVRPm3N7zpLdzwxPLElBN9aIRhIiIBNIIQkSqlA827gQS8449b+teOrZqctLbqa00ghCRGqtjqyYMzih9p2OJl0YQIlIlJeK4gZwcjSBERCSQAkJERA
IpIEREJJCOQYhIlXJZ++bJLkGiFBAiUqU8N/qSZJcgUdrFJCIigRQQIiISSAEhIiKBFBAiIhJIASEiIoEUECIiEkgBISIigRQQIiISSAEhIiKB4jqT2swuBR4E0qPrGODu3i680kREJJnivdTGTOBOYAVwKLxyRESkqog3IPa4+2uhViIiIlVKvAGxxMweBV4Gvjky090/DKUqERFJungD4sjlFTNj5jnQP7HliIhIVRFXQLh7v7ALERGRqiWuj7maWVMz+52Z5UQfvzWzpnGsN9DM1ptZvplNDFjezMzmmtkqM8s2s87xrisiIuGK9zyIWcA+4ProYy/w52OtYGYpwHRgENARGGZmHUs1uw/IdfeuwAjgD8exroiIhCjegDjH3Se5+4boYzJQ0TkQFwP50fYHgSxgcKk2HYHFAO7+MdDGzFrGua6IiIQo3oD4j5lddmQieuLcfypYpzXwecx0YXRerJXAtdFtXkzkRLy0ONc9UsttR3Z9FRUVxdEVERGJR7yfYvoF8Ez0uIMBO4GbK1jHAuZ5qekpwB/MLBdYDXwEFMe5bmSm+5PAkwCZmZmBbURE5PjF+ymmXKCbmTWJTu+NY7VC4KyY6TRgS6nt7gVGAZiZARujj1MrWldERMJ1zIAws5vc/Tkzu6vUfADc/XfHWH05cK6ZtQU2A0OBG0tt5zTgQPQ4w2hgqbvvNbMK1xURkXBVNIJoGP3a+Hg37O7FZnYH8DqQAsxy97VmNia6fAZwPvCsmR0C8oBbj7Xu8dYgIiIn7pgB4e5PRL9OPpGNu/tCYGGpeTNini8Dzo13XRERqTzxnij3v2bWxMzqmtliM/vSzG4KuzgREUmeeD/mOiB6QPlqIgef/wu4J7SqREQk6eINiLrRr1cCs919Z0j1iIhIFRHveRCvmNnHRE6O+28zawF8HV5ZIiKSbHGNINx9ItATyHT3b4Gv0KUvRERqtIrOg+jv7m+Z2bUx82KbvBxWYSIiklwV7WLqA7wFXBOwzFFAiIjUWBWdBzEp+nVU5ZQjIiJVRbznQfy/6GUxjkw3M7OHQ6tKRESSLt6PuQ5y991HJtx9F5GPvIqISA0Vb0CkmNkpRybMrAFwyjHai4hINRfveRDPAYvN7M9EDk7fAjwTWlUiIpJ08d4P4n/NbBXwQyI383nI3V8PtTIREUmqeEcQAOuAYnd/08xONbPG7r4vrMJERCS54v0U08+BvwJPRGe1BuaFVJOIiFQB8R6kHgtcCuwFcPdPgNPDKkpERJIv3oD4JnpbUADMLJXIwWoREamh4g2Id8zsPqCBmV0OvAS8El5ZIiKSbPEGxASgCFgN3E7kVqD3h1WUiIgkX4WfYjKzOsAqd+8M/Cn8kkREpCqocATh7oeBlWZ2diXUIyIiVUS850G0AtaaWTaRmwUB4O4/CqUqERFJungDYnKoVYiISJVT0R3l6gNjgPZEDlDPdPfiyihMRESSq6JjEM8AmUTCYRDw29ArEhGRKqGiXUwd3b0LgJnNBLLDL0lERKqCikYQ3x55ol1LIiK1S0UjiG5mtjf63IicSb03+tzdvUmo1YmISNIcMyDcPaWyChERkaol3kttiIhILaOAEBGRQAoIEREJpIAQEZFACggREQmkgBARkUAKCBERCRRqQJjZQDNbb2b5ZjYxYHlTM3vFzFaa2VozGxWzrMDMVptZrpnlhFmniIiUFe/lvo+bmaUA04HLgUJguZktcPe8mGZjgTx3v8bMWgDrzex5dz8YXd7P3b8Mq0YRESlfmCOIi4F8d98QfcHPAgaXauNAYzMzoBGwE9A1n0REqoAwA6I18HnMdGF0XqxpwPnAFiKXFB8XvcUpRMLjDTNbYWa3lfdNzOw2M8sxs5yioqLEVS8iUsuFGRAWMM9LTV8B5AJnAhnANDM7cgHAS929O5H7UIw1s95B38Tdn3T3THfPbNGiRUIKFxGRcAOiEDgrZjqNyEgh1ijgZY/IBzYCHQDcfUv063ZgLp
FdViIiUknCDIjlwLlm1tbM6gFDgQWl2mwCfgBgZi2B84ANZtbQzBpH5zcEBgBrQqxVRERKCe1TTO5ebGZ3AK8DKcA
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": []
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"<sklearn.metrics._plot.roc_curve.RocCurveDisplay at 0x2573dc484c8>"
]
},
"execution_count": 67,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEGCAYAAABo25JHAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy86wFpkAAAACXBIWXMAAAsTAAALEwEAmpwYAAAjOklEQVR4nO3deXxV5b3v8c+vDEKR6chQJDJoAWUKQlRQEbA9FJRTZxCsWpUX0tbhtJdesFrH9liv1FKOWC5VrlrL4IiUKmoriBUVgsQAQRQVMQgVUYZIUYbf/WOt7LMJO8kOydqbZH3fr9d+Za+1nrXW79lJ9m89a3gec3dERCS+vpHtAEREJLuUCEREYk6JQEQk5pQIRERiTolARCTm6mc7gKpq1aqVd+rUKdthiIjUKitWrPjM3VunWlbrEkGnTp3Iz8/PdhgiIrWKmX1U3jKdGhIRiTklAhGRmFMiEBGJOSUCEZGYUyIQEYm5yBKBmc00s0/NbHU5y83MpprZejMrNLO+UcUiIiLli7JF8DAwrILlw4Eu4Wsc8IcIYxERkXJE9hyBuy8xs04VFDkPeNSDfrDfMLMWZtbO3TdHFZNU36w3N/JswaZshyESS92PbcZt/9GjxrebzWsE7YGPk6aLw3mHMLNxZpZvZvlbt27NSHCS2rMFmyjavDPbYYhIDcrmk8WWYl7KUXLcfQYwAyAvL08j6WRZ93bNmHvtgGyHISI1JJstgmLguKTpHOCTLMUiIhJb2UwE84ErwruH+gM7dH1ARCTzIjs1ZGazgcFAKzMrBm4DGgC4+3TgOeAcYD2wG7gqqlhERKR8Ud41NLqS5Q78JKr9i4hIevRksYhIzCkRiIjEnBKBiEjMKRGIiMScEoGISMwpEYiIxJwSgYhIzCkRiIjEnBKBiEjMKRGIiMScEoGISMwpEYiIxJwSgYhIzCkRiIjEnBKBiEjMKRGIiMScEoGISMwpEYiIxFxkQ1XGwaw3N/JswaZsh5FRRZt30r1ds2yHISI1SC2Cani2YBNFm3dmO4yM6t6uGef1aZ/tMESkBqlFUE3d2zVj7rUDsh2GiMhhU4tARCTmlAhERGJOiUBEJOaUCEREYk6JQEQk5pQIRERiTolARCTmlAhERGJOiUBEJOaUCEREYi7SRGBmw8xsnZmtN7NJKZY3N7O/mNnbZrbGzK6KMh4RETlUZInAzOoB04DhQHdgtJl1L1PsJ0CRu+cCg4HfmlnDqGISEZFDRdkiOBVY7+4fuPvXwBzgvDJlHGhqZgYcDXwO7IswJhERKSPKRNAe+Dhpujicl+x+4CTgE2AVcKO7Hyi7ITMbZ2b5Zpa/devWqOIVEYmlKBOBpZjnZaa/BxQAxwJ9gPvN7JBRT9x9hrvnuXte69atazpOEZFYizIRFAPHJU3nEBz5J7sKeNoD64EPgRMjjElERMqIMhEsB7qYWefwAvClwPwyZTYC3wEws7ZAN+CDCGMSEZEyIhuhzN33mdl1wAtAPWCmu68xs/Hh8unAXcDDZraK4FTSRHf/LKqYRETkUJEOVenuzwHPlZk3Pen9J8DQKGMQEZGK6cliEZGYUyIQEYk5JQIRkZiL9BrBkWTWmxt5tmBTjW6zaPNOurc75LEHEZFaJTYtgmcLNlG0eWeNbrN7u2ac16fsw9IiIrVLbFoEEHxxz712QLbDEBE5osSmRSAiIqkpEYiIxJwSgYhIzCkRiIjEXNqJwMyaRBmIiIhkR6WJwMxON7MiYG04nWtmD0QemYiIZEQ6LYLfEQwgsw3A3d8GzooyKBERyZy0Tg25+8dlZu2PIBYREcmCdB4o+9jMTgc8HGDmBsLTRCIiUvul0yIYD/yEYOD5YoKxhX8cYUwiIpJB6bQIurn7ZckzzOwM4LVoQhIRkUxKp0Xw32nOExGRWqjcFoGZDQBOB1qb2c+SFjUjGINYRETqgIpODTUEjg7LNE2avxO4OMqgREQkc8pNBO7+CvCKmT
3s7h9lMCYREcmgdC4W7zaze4EeQKPSme5+dmRRiYhIxqRzsfjPwDtAZ+AOYAOwPMKYREQkg9JJBMe4+0PAXnd/xd2vBvpHHJeIiGRIOqeG9oY/N5vZucAnQE50IYmISCalkwh+ZWbNgf9F8PxAM+A/owxKREQyp9JE4O4Lwrc7gCGQeLJYRETqgIoeKKsHjCToY2ihu682sxHAL4DGwMmZCVFERKJUUYvgIeA4YBkw1cw+AgYAk9x9XgZiExGRDKgoEeQBvd39gJk1Aj4Dvu3uWzITmoiIZEJFt49+7e4HANx9D/BuVZOAmQ0zs3Vmtt7MJpVTZrCZFZjZGjN7pSrbFxGR6quoRXCimRWG7w04IZw2wN29d0UbDq8xTAP+nWAcg+VmNt/di5LKtAAeAIa5+0Yza3P4VRERkcNRUSI4qZrbPhVY7+4fAJjZHOA8oCipzBjgaXffCODun1ZznyIiUkUVdTpX3Y7m2gPJYx0XA6eVKdMVaGBmiwl6OP29uz9adkNmNg4YB9ChQ4dqhiUiIsnSGrz+MFmKeV5muj7QDzgX+B7wSzPreshK7jPcPc/d81q3bl3zkYqIxFg6TxYfrmKC209L5RB0T1G2zGfu/iXwpZktAXKBdyOMS0REkqTVIjCzxmbWrYrbXg50MbPOZtYQuBSYX6bMs8BAM6tvZt8kOHW0tor7ERGRaqg0EZjZfwAFwMJwuo+Zlf1CP4S77wOuA14g+HJ/3N3XmNl4MxsfllkbbreQ4MG1B9199WHWRUREDkM6p4ZuJ7gDaDGAuxeYWad0Nu7uzwHPlZk3vcz0vcC96WxPRERqXjqnhva5+47IIxERkaxIp0Ww2szGAPXMrAtwA7A02rBERCRT0mkRXE8wXvFXwCyC7qj/M8KYREQkg9JpEXRz95uBm6MORkREMi+dFsF9ZvaOmd1lZj0ij0hERDKq0kTg7kOAwcBWYIaZrTKzW6IOTEREMiOtB8rcfYu7TwXGEzxTcGuUQYmISOak80DZSWZ2u5mtBu4nuGMoJ/LIREQkI9K5WPz/gNnAUHcv21eQiIjUcpUmAnfvn4lAREQkO8pNBGb2uLuPNLNVHNx9dFojlImISO1QUYvgxvDniEwEIiIi2VHuxWJ33xy+/bG7f5T8An6cmfBERCRq6dw++u8p5g2v6UBERCQ7KrpG8COCI//jzawwaVFT4LWoAxMRkcyo6BrBLOB54G5gUtL8Xe7+eaRRiYhIxlSUCNzdN5jZT8ouMLN/UzIQEakbKmsRjABWENw+aknLHDg+wrhERCRDyk0E7j4i/Nk5c+GIiEimpdPX0Blm1iR8/wMzu8/MOkQfmoiIZEI6t4/+AdhtZrnA/wY+Av4UaVQiIpIx6Q5e78B5wO/d/fcEt5CKiEgdkE7vo7vM7CbgcmCgmdUDGkQbloiIZEo6LYJRBAPXX+3uW4D2wL2RRiUiIhmTzlCVW4A/A83NbASwx90fjTwyERHJiHTuGhoJLAMuAUYCb5rZxVEHJiIimZHONYKbgVPc/VMAM2sN/A14MsrAREQkM9K5RvCN0iQQ2pbmeiIiUguk0yJYaGYvEIxbDMHF4+eiC0lERDIpnTGLf25mFwJnEvQ3NMPdn4k8MhERyYiKxiPoAkwGTgBWARPcfVOmAhMRkcyo6Fz/TGABcBFBD6T/XdWNm9kwM1tnZuvNbFIF5U4xs/26G0lEJPMqOjXU1N3/GL5fZ2ZvVWXD4RPI0wiGuiwGlpvZfHcvSlHuHuCFqmxfRERqRkWJoJGZncz/jEPQOHna3StLDKcC6939AwAzm0PQX1FRmXLXA08Bp1QxdhERqQEVJYLNwH1J01uSph04u5Jttwc+TpouBk5LLmBm7YELwm2VmwjMbBwwDqBDB/WALSJSkyoamGZINbdtKeZ5mekpwER332+WqngilhnADIC8vLyy2xARkWpI5zmCw1UMHJc0nQN8UqZMHjAnTAKtgHPMbJ+7z4swLhERSRJlIlgOdD
GzzsAm4FJgTHKB5GEwzexhYIGSgIhIZkWWCNx9n5ldR3A3UD1gpruvMbPx4fLpUe1bRETSV2kisOC8zWXA8e5+Zzh
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Final Task: A patient with the following features has come into the medical office:**\n",
"\n",
"    age          54.0\n",
"    sex           1.0\n",
"    cp            0.0\n",
"    trestbps    122.0\n",
"    chol        286.0\n",
"    fbs           0.0\n",
"    restecg       0.0\n",
"    thalach     116.0\n",
"    exang         1.0\n",
"    oldpeak       3.2\n",
"    slope         1.0\n",
"    ca            2.0\n",
"    thal          2.0"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: What does your model predict for this patient? Do they have heart disease? How \"sure\" is your model of this prediction?**\n",
"\n",
"*For convenience, we created an array of the features for the patient above.*"
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {},
"outputs": [],
"source": [
"patient = [[ 54. , 1. , 0. , 122. , 286. , 0. , 0. , 116. , 1. ,\n",
" 3.2, 1. , 2. , 2. ]]"
]
},
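{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Hint (a minimal sketch, not the official solution):* `predict` returns the class label and `predict_proba` returns one probability per class, which is how you can judge how sure the model is. If you scaled the training data, the patient array must go through the same fitted scaler before prediction. The example below uses a synthetic stand-in model, so its output will not match the expected values shown further down. The names `X`, `y`, `scaler`, `model` and `scaled_patient` are assumptions made for illustration.\n",
"\n",
"```python\n",
"from sklearn.datasets import make_classification\n",
"from sklearn.linear_model import LogisticRegression\n",
"from sklearn.preprocessing import StandardScaler\n",
"\n",
"# Synthetic stand-in for the training data (assumption): 13 features\n",
"X, y = make_classification(n_samples=303, n_features=13, random_state=101)\n",
"scaler = StandardScaler().fit(X)\n",
"model = LogisticRegression().fit(scaler.transform(X), y)\n",
"\n",
"patient = [[54., 1., 0., 122., 286., 0., 0., 116., 1., 3.2, 1., 2., 2.]]\n",
"\n",
"# Reuse the fitted scaler; never refit it on the new patient\n",
"scaled_patient = scaler.transform(patient)\n",
"\n",
"print(model.predict(scaled_patient))        # predicted class label\n",
"print(model.predict_proba(scaled_patient))  # columns follow model.classes_\n",
"```\n",
"\n",
"Each row returned by `predict_proba` sums to 1, so a value close to 1 in one column means the model is very confident in that class."
]
},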
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0], dtype=int64)"
]
},
"execution_count": 71,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# EXPECTED PREDICTION"
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[9.99999862e-01, 1.38455917e-07]])"
]
},
"execution_count": 72,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# EXPECTED PROBABILITY PER CLASS (the model should be extremely sure the patient is in class 0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"----\n",
"\n",
"## Great Job!"
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 1
}