{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"___\n",
"\n",
"<a href='http://www.pieriandata.com'><img src='../Pierian_Data_Logo.png'/></a>\n",
"___\n",
"<center><em>Copyright by Pierian Data Inc.</em></center>\n",
"<center><em>For more information, visit us at <a href='http://www.pieriandata.com'>www.pieriandata.com</a></em></center>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Supervised Learning Capstone Project - Tree Methods Focus - SOLUTIONS\n",
"\n",
"\n",
"\n",
"## Make sure to review the introduction video to understand the 3 ways of approaching this project exercise!\n",
"\n",
"----\n",
"\n",
"**Ways to approach the project:**\n",
" 1. Open a new notebook, read in the data, analyze and visualize whatever you want, and then create a predictive model.\n",
" 2. Use this notebook as a general guide, completing the tasks in bold shown below.\n",
" 3. Skip to the solutions notebook and video, and treat the project as a more relaxed code-along walkthrough lecture series.\n",
"\n",
"------\n",
"------\n",
"\n",
"## GOAL: Create a model to predict whether or not a customer will churn.\n",
"\n",
"----\n",
"----\n",
"\n",
"\n",
"## Complete the Tasks in Bold Below!\n",
"\n",
"## Part 0: Imports and Read in the Data\n",
"\n",
"**TASK: Run the filled-out cells below to import libraries and read in your data. The data file is \"Telco-Customer-Churn.csv\".**"
]
},
{
"cell_type": "code",
"execution_count": 132,
"metadata": {},
"outputs": [],
"source": [
"# RUN THESE CELLS TO START THE PROJECT!\n",
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns"
]
},
{
"cell_type": "code",
"execution_count": 133,
"metadata": {},
"outputs": [],
"source": [
"df = pd.read_csv('../DATA/Telco-Customer-Churn.csv')"
]
},
{
"cell_type": "code",
"execution_count": 134,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>customerID</th>\n",
" <th>gender</th>\n",
" <th>SeniorCitizen</th>\n",
" <th>Partner</th>\n",
" <th>Dependents</th>\n",
" <th>tenure</th>\n",
" <th>PhoneService</th>\n",
" <th>MultipleLines</th>\n",
" <th>InternetService</th>\n",
" <th>OnlineSecurity</th>\n",
" <th>...</th>\n",
" <th>DeviceProtection</th>\n",
" <th>TechSupport</th>\n",
" <th>StreamingTV</th>\n",
" <th>StreamingMovies</th>\n",
" <th>Contract</th>\n",
" <th>PaperlessBilling</th>\n",
" <th>PaymentMethod</th>\n",
" <th>MonthlyCharges</th>\n",
" <th>TotalCharges</th>\n",
" <th>Churn</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>7590-VHVEG</td>\n",
" <td>Female</td>\n",
" <td>0</td>\n",
" <td>Yes</td>\n",
" <td>No</td>\n",
" <td>1</td>\n",
" <td>No</td>\n",
" <td>No phone service</td>\n",
" <td>DSL</td>\n",
" <td>No</td>\n",
" <td>...</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>Month-to-month</td>\n",
" <td>Yes</td>\n",
" <td>Electronic check</td>\n",
" <td>29.85</td>\n",
" <td>29.85</td>\n",
" <td>No</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>5575-GNVDE</td>\n",
" <td>Male</td>\n",
" <td>0</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>34</td>\n",
" <td>Yes</td>\n",
" <td>No</td>\n",
" <td>DSL</td>\n",
" <td>Yes</td>\n",
" <td>...</td>\n",
" <td>Yes</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>One year</td>\n",
" <td>No</td>\n",
" <td>Mailed check</td>\n",
" <td>56.95</td>\n",
" <td>1889.50</td>\n",
" <td>No</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3668-QPYBK</td>\n",
" <td>Male</td>\n",
" <td>0</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>2</td>\n",
" <td>Yes</td>\n",
" <td>No</td>\n",
" <td>DSL</td>\n",
" <td>Yes</td>\n",
" <td>...</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>Month-to-month</td>\n",
" <td>Yes</td>\n",
" <td>Mailed check</td>\n",
" <td>53.85</td>\n",
" <td>108.15</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>7795-CFOCW</td>\n",
" <td>Male</td>\n",
" <td>0</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>45</td>\n",
" <td>No</td>\n",
" <td>No phone service</td>\n",
" <td>DSL</td>\n",
" <td>Yes</td>\n",
" <td>...</td>\n",
" <td>Yes</td>\n",
" <td>Yes</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>One year</td>\n",
" <td>No</td>\n",
" <td>Bank transfer (automatic)</td>\n",
" <td>42.30</td>\n",
" <td>1840.75</td>\n",
" <td>No</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>9237-HQITU</td>\n",
" <td>Female</td>\n",
" <td>0</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>2</td>\n",
" <td>Yes</td>\n",
" <td>No</td>\n",
" <td>Fiber optic</td>\n",
" <td>No</td>\n",
" <td>...</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>Month-to-month</td>\n",
" <td>Yes</td>\n",
" <td>Electronic check</td>\n",
" <td>70.70</td>\n",
" <td>151.65</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 21 columns</p>\n",
"</div>"
],
"text/plain": [
" customerID gender SeniorCitizen Partner Dependents tenure PhoneService \\\n",
"0 7590-VHVEG Female 0 Yes No 1 No \n",
"1 5575-GNVDE Male 0 No No 34 Yes \n",
"2 3668-QPYBK Male 0 No No 2 Yes \n",
"3 7795-CFOCW Male 0 No No 45 No \n",
"4 9237-HQITU Female 0 No No 2 Yes \n",
"\n",
" MultipleLines InternetService OnlineSecurity ... DeviceProtection \\\n",
"0 No phone service DSL No ... No \n",
"1 No DSL Yes ... Yes \n",
"2 No DSL Yes ... No \n",
"3 No phone service DSL Yes ... Yes \n",
"4 No Fiber optic No ... No \n",
"\n",
" TechSupport StreamingTV StreamingMovies Contract PaperlessBilling \\\n",
"0 No No No Month-to-month Yes \n",
"1 No No No One year No \n",
"2 No No No Month-to-month Yes \n",
"3 Yes No No One year No \n",
"4 No No No Month-to-month Yes \n",
"\n",
" PaymentMethod MonthlyCharges TotalCharges Churn \n",
"0 Electronic check 29.85 29.85 No \n",
"1 Mailed check 56.95 1889.50 No \n",
"2 Mailed check 53.85 108.15 Yes \n",
"3 Bank transfer (automatic) 42.30 1840.75 No \n",
"4 Electronic check 70.70 151.65 Yes \n",
"\n",
"[5 rows x 21 columns]"
]
},
"execution_count": 134,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Part 1: Quick Data Check\n",
"\n",
"**TASK: Quickly confirm the datatypes and non-null counts in your dataframe with the .info() method.**"
]
},
{
"cell_type": "code",
"execution_count": 135,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 136,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 7032 entries, 0 to 7031\n",
"Data columns (total 21 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 customerID 7032 non-null object \n",
" 1 gender 7032 non-null object \n",
" 2 SeniorCitizen 7032 non-null int64 \n",
" 3 Partner 7032 non-null object \n",
" 4 Dependents 7032 non-null object \n",
" 5 tenure 7032 non-null int64 \n",
" 6 PhoneService 7032 non-null object \n",
" 7 MultipleLines 7032 non-null object \n",
" 8 InternetService 7032 non-null object \n",
" 9 OnlineSecurity 7032 non-null object \n",
" 10 OnlineBackup 7032 non-null object \n",
" 11 DeviceProtection 7032 non-null object \n",
" 12 TechSupport 7032 non-null object \n",
" 13 StreamingTV 7032 non-null object \n",
" 14 StreamingMovies 7032 non-null object \n",
" 15 Contract 7032 non-null object \n",
" 16 PaperlessBilling 7032 non-null object \n",
" 17 PaymentMethod 7032 non-null object \n",
" 18 MonthlyCharges 7032 non-null float64\n",
" 19 TotalCharges 7032 non-null float64\n",
" 20 Churn 7032 non-null object \n",
"dtypes: float64(2), int64(2), object(17)\n",
"memory usage: 1.1+ MB\n"
]
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Get a quick statistical summary of the numeric columns with .describe(). You should notice that many columns are categorical, meaning you will eventually need to convert them to dummy variables.**"
]
},
{
"cell_type": "code",
"execution_count": 137,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 138,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>SeniorCitizen</th>\n",
" <th>tenure</th>\n",
" <th>MonthlyCharges</th>\n",
" <th>TotalCharges</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>7032.000000</td>\n",
" <td>7032.000000</td>\n",
" <td>7032.000000</td>\n",
" <td>7032.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>0.162400</td>\n",
" <td>32.421786</td>\n",
" <td>64.798208</td>\n",
" <td>2283.300441</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>0.368844</td>\n",
" <td>24.545260</td>\n",
" <td>30.085974</td>\n",
" <td>2266.771362</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>18.250000</td>\n",
" <td>18.800000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>0.000000</td>\n",
" <td>9.000000</td>\n",
" <td>35.587500</td>\n",
" <td>401.450000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>0.000000</td>\n",
" <td>29.000000</td>\n",
" <td>70.350000</td>\n",
" <td>1397.475000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>0.000000</td>\n",
" <td>55.000000</td>\n",
" <td>89.862500</td>\n",
" <td>3794.737500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>1.000000</td>\n",
" <td>72.000000</td>\n",
" <td>118.750000</td>\n",
" <td>8684.800000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" SeniorCitizen tenure MonthlyCharges TotalCharges\n",
"count 7032.000000 7032.000000 7032.000000 7032.000000\n",
"mean 0.162400 32.421786 64.798208 2283.300441\n",
"std 0.368844 24.545260 30.085974 2266.771362\n",
"min 0.000000 1.000000 18.250000 18.800000\n",
"25% 0.000000 9.000000 35.587500 401.450000\n",
"50% 0.000000 29.000000 70.350000 1397.475000\n",
"75% 0.000000 55.000000 89.862500 3794.737500\n",
"max 1.000000 72.000000 118.750000 8684.800000"
]
},
"execution_count": 138,
"metadata": {},
"output_type": "execute_result"
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Part 2: Exploratory Data Analysis\n",
"\n",
"## General Feature Exploration\n",
"\n",
"**TASK: Confirm that there are no NaN cells by displaying NaN values per feature column.**"
]
},
{
"cell_type": "code",
"execution_count": 139,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 140,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"customerID 0\n",
"gender 0\n",
"SeniorCitizen 0\n",
"Partner 0\n",
"Dependents 0\n",
"tenure 0\n",
"PhoneService 0\n",
"MultipleLines 0\n",
"InternetService 0\n",
"OnlineSecurity 0\n",
"OnlineBackup 0\n",
"DeviceProtection 0\n",
"TechSupport 0\n",
"StreamingTV 0\n",
"StreamingMovies 0\n",
"Contract 0\n",
"PaperlessBilling 0\n",
"PaymentMethod 0\n",
"MonthlyCharges 0\n",
"TotalCharges 0\n",
"Churn 0\n",
"dtype: int64"
]
},
"execution_count": 140,
"metadata": {},
"output_type": "execute_result"
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Display the balance of the class labels (Churn) with a Count Plot.**"
]
},
{
"cell_type": "code",
"execution_count": 141,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src='fig1.png' >"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Explore the distribution of TotalCharges between Churn categories with a Box Plot or Violin Plot.**"
]
},
{
"cell_type": "code",
"execution_count": 143,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src='fig2.png' >"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Create a boxplot showing the distribution of TotalCharges per Contract type, and add a hue coloring based on the Churn class.**"
]
},
{
"cell_type": "code",
"execution_count": 145,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src='fig3.png' >"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Create a bar plot showing the correlation of the following features to the class label. Keep in mind that you will need to convert the categorical features into dummy variables first, as you can only calculate correlation for numeric features.**\n",
"\n",
" ['gender', 'SeniorCitizen', 'Partner', 'Dependents','PhoneService', 'MultipleLines', \n",
" 'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport', 'InternetService',\n",
" 'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling', 'PaymentMethod']\n",
"\n",
"***Note, we specifically listed only the features above, you should not check the correlation for every feature, as some features have too many unique instances for such an analysis, such as customerID***"
]
},
{
"cell_type": "code",
"execution_count": 147,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 148,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Index(['customerID', 'gender', 'SeniorCitizen', 'Partner', 'Dependents',\n",
" 'tenure', 'PhoneService', 'MultipleLines', 'InternetService',\n",
" 'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport',\n",
" 'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling',\n",
" 'PaymentMethod', 'MonthlyCharges', 'TotalCharges', 'Churn'],\n",
" dtype='object')"
]
},
"execution_count": 148,
"metadata": {},
"output_type": "execute_result"
}
],
"source": []
},
{
"cell_type": "code",
"execution_count": 149,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 150,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Contract_Two year -0.301552\n",
"StreamingMovies_No internet service -0.227578\n",
"StreamingTV_No internet service -0.227578\n",
"TechSupport_No internet service -0.227578\n",
"DeviceProtection_No internet service -0.227578\n",
"OnlineBackup_No internet service -0.227578\n",
"OnlineSecurity_No internet service -0.227578\n",
"InternetService_No -0.227578\n",
"PaperlessBilling_No -0.191454\n",
"Contract_One year -0.178225\n",
"OnlineSecurity_Yes -0.171270\n",
"TechSupport_Yes -0.164716\n",
"Dependents_Yes -0.163128\n",
"Partner_Yes -0.149982\n",
"PaymentMethod_Credit card (automatic) -0.134687\n",
"InternetService_DSL -0.124141\n",
"PaymentMethod_Bank transfer (automatic) -0.118136\n",
"PaymentMethod_Mailed check -0.090773\n",
"OnlineBackup_Yes -0.082307\n",
"DeviceProtection_Yes -0.066193\n",
"MultipleLines_No -0.032654\n",
"MultipleLines_No phone service -0.011691\n",
"PhoneService_No -0.011691\n",
"gender_Male -0.008545\n",
"gender_Female 0.008545\n",
"PhoneService_Yes 0.011691\n",
"MultipleLines_Yes 0.040033\n",
"StreamingMovies_Yes 0.060860\n",
"StreamingTV_Yes 0.063254\n",
"StreamingTV_No 0.128435\n",
"StreamingMovies_No 0.130920\n",
"Partner_No 0.149982\n",
"SeniorCitizen 0.150541\n",
"Dependents_No 0.163128\n",
"PaperlessBilling_Yes 0.191454\n",
"DeviceProtection_No 0.252056\n",
"OnlineBackup_No 0.267595\n",
"PaymentMethod_Electronic check 0.301455\n",
"InternetService_Fiber optic 0.307463\n",
"TechSupport_No 0.336877\n",
"OnlineSecurity_No 0.342235\n",
"Contract_Month-to-month 0.404565\n",
"Name: Churn_Yes, dtype: float64"
]
},
"execution_count": 150,
"metadata": {},
"output_type": "execute_result"
}
],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src='figbar.png'>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"---\n",
"\n",
"# Part 3: Churn Analysis\n",
"\n",
"**This section focuses on segmenting customers into \"cohorts\" based on their tenure, allowing us to examine differences between customer cohort segments.**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: What are the 3 contract types available?**"
]
},
{
"cell_type": "code",
"execution_count": 152,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 153,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array(['Month-to-month', 'One year', 'Two year'], dtype=object)"
]
},
"execution_count": 153,
"metadata": {},
"output_type": "execute_result"
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Create a histogram displaying the distribution of the 'tenure' column, which is the number of months a customer was or has been a customer.**"
]
},
{
"cell_type": "code",
"execution_count": 154,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"fig5.png\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Now use the seaborn documentation as a guide to create histograms separated by two additional features, Churn and Contract.**"
]
},
{
"cell_type": "code",
"execution_count": 156,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"fig6.png\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Display a scatter plot of Total Charges versus Monthly Charges, and color hue by Churn.**"
]
},
{
"cell_type": "code",
"execution_count": 158,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 159,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Index(['customerID', 'gender', 'SeniorCitizen', 'Partner', 'Dependents',\n",
" 'tenure', 'PhoneService', 'MultipleLines', 'InternetService',\n",
" 'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport',\n",
" 'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling',\n",
" 'PaymentMethod', 'MonthlyCharges', 'TotalCharges', 'Churn'],\n",
" dtype='object')"
]
},
"execution_count": 159,
"metadata": {},
"output_type": "execute_result"
}
],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src='fig7.png'>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating Cohorts based on Tenure\n",
"\n",
"**Let's begin by treating each unique tenure length (1 month, 2 months, 3 months, ... N months) as its own cohort.**\n",
"\n",
"**TASK: Treating each unique tenure group as a cohort, calculate the Churn rate (percentage that had Yes Churn) per cohort. For example, the cohort that has had a tenure of 1 month should have a Churn rate of 61.99%. You should have cohorts for 1-72 months, with a general trend that the longer the tenure of the cohort, the lower the churn rate. This makes sense, as customers are less likely to stop service the longer they've had it.**"
]
},
{
"cell_type": "code",
"execution_count": 161,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 162,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 163,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 164,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tenure\n",
"1 61.990212\n",
"2 51.680672\n",
"3 47.000000\n",
"4 47.159091\n",
"5 48.120301\n",
" ... \n",
"68 9.000000\n",
"69 8.421053\n",
"70 9.243697\n",
"71 3.529412\n",
"72 1.657459\n",
"Name: customerID, Length: 72, dtype: float64"
]
},
"execution_count": 164,
"metadata": {},
"output_type": "execute_result"
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Now that you have the Churn rate per tenure group (1-72 months), create a plot showing churn rate per month of tenure.**"
]
},
{
"cell_type": "code",
"execution_count": 165,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src='fig9.png'>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Broader Cohort Groups\n",
"**TASK: Based on the tenure column values, create a new column called Tenure Cohort that creates 4 separate categories:**\n",
" * '0-12 Months'\n",
" * '24-48 Months'\n",
" * '12-24 Months'\n",
" * 'Over 48 Months' "
]
},
{
"cell_type": "code",
"execution_count": 167,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 168,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 169,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 170,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>tenure</th>\n",
" <th>Tenure Cohort</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>0-12 Months</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>34</td>\n",
" <td>24-48 Months</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2</td>\n",
" <td>0-12 Months</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>45</td>\n",
" <td>24-48 Months</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2</td>\n",
" <td>0-12 Months</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>8</td>\n",
" <td>0-12 Months</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>22</td>\n",
" <td>12-24 Months</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>10</td>\n",
" <td>0-12 Months</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>28</td>\n",
" <td>24-48 Months</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>62</td>\n",
" <td>Over 48 Months</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" tenure Tenure Cohort\n",
"0 1 0-12 Months\n",
"1 34 24-48 Months\n",
"2 2 0-12 Months\n",
"3 45 24-48 Months\n",
"4 2 0-12 Months\n",
"5 8 0-12 Months\n",
"6 22 12-24 Months\n",
"7 10 0-12 Months\n",
"8 28 24-48 Months\n",
"9 62 Over 48 Months"
]
},
"execution_count": 170,
"metadata": {},
"output_type": "execute_result"
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Create a scatterplot of Total Charges versus Monthly Charges, colored by the Tenure Cohort defined in the previous task.**"
]
},
{
"cell_type": "code",
"execution_count": 171,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src='fig10.png'>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Create a count plot showing the churn count per cohort.**"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src='cplot.png'>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Create a grid of Count Plots showing counts per Tenure Cohort, separated out by contract type and colored by the Churn hue.**"
]
},
{
"cell_type": "code",
"execution_count": 174,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src='fig11.png'>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"-----\n",
"\n",
"# Part 4: Predictive Modeling\n",
"\n",
"**Let's explore 4 different tree-based methods: a single decision tree, random forest, AdaBoost, and gradient boosting. Feel free to add any other supervised learning models to your comparisons!**\n",
"\n",
"\n",
"## Single Decision Tree"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Separate out the data into X features and Y label. Create dummy variables where necessary, and note which features are not useful and should be dropped.**"
]
},
{
"cell_type": "code",
"execution_count": 178,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 181,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 182,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Perform a train test split, holding out 10% of the data for testing. We'll use a random_state of 101 in the solutions notebook/video.**"
]
},
{
"cell_type": "code",
"execution_count": 183,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 184,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 185,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Decision Tree Perfomance. Complete the following tasks:**\n",
" 1. Train a single decision tree model (feel free to grid search for optimal hyperparameters).\n",
" 2. Evaluate performance metrics from decision tree, including classification report and plotting a confusion matrix.\n",
" 2. Calculate feature importances from the decision tree.\n",
" 4. OPTIONAL: Plot your tree, note, the tree could be huge depending on your pruning, so it may crash your notebook if you display it with plot_tree."
]
},
{
"cell_type": "code",
"execution_count": 222,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 227,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" precision recall f1-score support\n",
"\n",
" No 0.87 0.89 0.88 557\n",
" Yes 0.55 0.49 0.52 147\n",
"\n",
" accuracy 0.81 704\n",
" macro avg 0.71 0.69 0.70 704\n",
"weighted avg 0.80 0.81 0.81 704\n",
"\n"
]
}
],
"source": []
},
{
"cell_type": "code",
"execution_count": 228,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<sklearn.metrics._plot.confusion_matrix.ConfusionMatrixDisplay at 0x2d1e9601d90>"
]
},
"execution_count": 228,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 432x288 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": []
},
{
"cell_type": "code",
"execution_count": 229,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 230,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAACRoAAAYjCAYAAACSohEUAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAB7CAAAewgFu0HU+AAEAAElEQVR4nOzdebh153w38O/vyZOZmIKEIIYiRUwRgkiCUkJNL1J9a6ig5paaqu9LJ1qqFby0VZWooWhrnqfEFEkQYyIRQ8xkJAmZ7/ePtY5n5Tj7rH2mfZ7kfD7Xta619lr3uu/f2Wft/aTOt/ddrbUAAAAAAAAAAAAsZtN6FwAAAAAAAAAAAGz9BI0AAAAAAAAAAIBRgkYAAAAAAAAAAMAoQSMAAAAAAAAAAGCUoBEAAAAAAAAAADBK0AgAAAAAAAAAABglaAQAAAAAAAAAAIwSNAIAAAAAAAAAAEYJGgEAAAAAAAAAAKMEjQAAAAAAAAAAgFGCRgAAAAAAAAAAwChBIwAAAAAAAAAAYJSgEQAAAAAAAAAAMErQCAAAAAAAAAAAGCVoBAAAAAAAAAAAjBI0AgAAAAAAAAAARgkaAQAAAAAAAAAAowSNAAAAAAAAAACAUYJGAAAAAAAAAADAKEEjAAAAANjKVNVDq+o9VfXDqrqgqlq/HbnetbF8VXXgrH6XVXX4YKxHr+VYAAAAbByCRgAAAKyKqjpy8AfNpWwHrnftrL15f1xv610PbK2q86Ykb0tyvyTXSbLd+la1ekb+rTi/qn5aVd+sqk9X1Sur6tFVdf31rputV/+MLOe/P/y3CQAAwDIIGgEAAMASmCGCy7OqeuHg+X3hetfDgh7Rb3OOTXJ4kv/Xb+9Yh5pmZfsk10pykyR3SfKUJK9P8p2qel9V3Xs9iwMAAACSzetdAAAAAFdIx6X74/g0friWhQBczvzh4PgFrbW/WrdK1t78fys2JblKkqsmuUWSGwzO3zfJfavq8CRPa62dM7sy2cqdmC6Et5gHpZsdLJnuv1H8twkAAMAEgkYAAACshfe31l643kUAXA7dbnD8unWrYjYW/beiqnZLF7x6WpI9+tOPTnKLqjqgtfarNa9wlbXWjkxSMxrr0eneryu01toxSY5ZrE1V3TJbgkb+GwUAAGAFLJ0GAAAAAFuPqw2Of7xuVWwFWms/aa29NMleSd4+uHSHdMvJAQAAADMmaAQAAAAAW49fz0DeWrt0PQvZWrTWzk3y8CTvG5x+WFXdbZ1KAgAAgA1L0AgAAICtUnUeVFVHVNXJVfXzqjq/qr5fVe+sqkdV1VRLglfVXlX1p1X1P1V1UlWdU1UXVdVpVfX5qvqnqvrtkT6+W1UtyaMGp19fVW2B7YUL3dtve05R7+GD9o+etk1VXbWqnl5Vn6yqH1bVxf31qy5w/6q9v6upqo4c/FwH9ud2r6oXVNXxVXVmX+c3qurvqurqC/SxR1W9qG9/Vv/7/lJV/XlV7Tgy/p6D8b87OH+PqnpLVX2rqn7VPzufqqqnVNX2S/wZb1FVL+3rO72qLqiqH/U/+3Oq6hpT9PHoQZ2H9+e2qapDqupdVfXtvs5WVQ+ce1+TvGDQzQsmPL+HLzDejn0/r6iqT1fVT6vqwqo6t3++31FVj62q7aao/cDBWEcOzt+9qv6zr/38qjqjf5afUlXbTvHWDse4dlU9u6o+UlXf69+LX/XHH+iv7TlFP9tW1R9W1dv6us6pqvOq6jv98/CgqlrxMlg1+I6Yd/43fj+L9DHz52qlP/dStNZakkcmOWdw+vnT3FvdvwEvqqpjB8/uaVV1TFX9VVVdZ7yXy/S3TVU9rKreUN2/KWdV92/KGX2fh1X3nfEbz8ak53/CODevqpdU1ef63+mF/WfjZ1X1hap6fXXf1VebcP/ovyXz2l+pqp5WVR+qqh/0Y51VVV+rqldV1R2nfH9+43mtqptV1cur6sT+e+MXVfXlqnpxVe06Tb+zUJP/Df
rz/vn5SVVdUlVnT7h/1b8zVvv5BQAAWKmZ/w+GAAAAMKaq9k5yRJLbLHB5j357QJLnVdWDW2snLNLX25I8dMLlXfvt9kmeXlWHJfmz1tolKyh/XVTVXZK8Jcn1pmi7au/vWquqeyV5c5L5IYmbJXlOkkOq6oDW2ql9+z9K8uok88M/t+63R1TVQa2106Ycf9skr0ry+HmXdkhy1357UlU9sLV28khfm5P8Y5InJdlm3uXd++2AJM+tqj9prR0xTY1939dJ8ta+nlXVhws+muRKC1zeNsnOSW6Q5IFJ/qJ/Zo5fQv/bpXuPHzfv0vZJ9u+3x1TVvVtrp4/0tSnJX6R7NnZaoMn1+u13k7y4qm416fnuQwb/luTGC1zes98OSfK5qvpfrbUfLlbbWrm8PlfL0Vo7s7oA1FP7U79TVVdvrZ25UPvqQoCHJTk0v/nezH3/75vkz6rq2a21V43VUFX7p3subrrA5av3/e2b5GlJ/j7Jc8f6nDDOC9M9y/PrTpJr9tvtkjw6yZuS/O/ljDMY735JXptkt3mXtk9y1SS3SPLkqnpzkse11n65hL7/OMnL85vfy3v32+Oq6ndba59fXvVrp6oekOT1ueyShpPaHphV/M5Yi+cXAABgNQgaAQAAsFWpbimc9yTZpT91UZLjknyzP94z3R+9d0gXNvlsVe3XWjtxQpfX7/cXJzmh7+fsJJckuVaSOyS5bpJK8ifp/hD6pAX6OSJd2OUeSW7en/tYkm8s0PbYsZ9zld0k3R9xr5Juto9PJvlRuj+MXmZpoTV4f9fSbZK8KMmOSX6Q5DPpfr6bpgufVLqAyweq6lbpllZ6XX/vN9P9Hs5Pcqt0f4xNuj+W/0e6oMk0/j5bQkZfSfKlftzbJ5mbBWuvJB/v36fvL9RJH4D57yS/Nzh9ZpIj+/31khyUZLt0f9Q/vKqu2lo7bIoat0/y7r6mi5N8Nsm3+vO369u8I8nX0r0Pd+jPHZeFn9XPzXt9tWwJGf0sydfT/T7OSxfmuUnf7+Z0z89RVXW71topU9SeJP+abqawS5Mck+4ztSnJndI9g+l/jjckue+kTqpqmyRvT/KgwekLkxyd5Lvpnu/d0r1Pu/djLDgDU1U9NF14Y24mpV+le1++29d50yT79T/znZIcXVV3aK39dMqfeb6575ckefLg/P9b7Kat4LlaD2/PlqBRpfu+evf8RlW1c5IPJbnL4PS3knwhyVnpQkF3SXKddN8xr6yqXVprL5o0cFUdku45HM6wdXKS45P8PN336i36bVO679Elq6qn57Kzj52e7vn7cZLW137zdN89CwWRljrew9M973N9XZLk00lOSffZ3z/d+5Qkj0hyw6q6e2vt/Cn6fnSS1/QvT0ry+XSfp5une/8r3bP/7qraq7X285X+PKvozklemO73fUa6f1tPT/ffDrcdNlzt74y1eH4BAABWTWvNZrPZbDabzWaz2Wy2FW/p/rDd+u2Fy+xjtyQ/HfRzRJLdF2h37ST/M2j3lSTbTOjzxelmNNplwvVKcv90AYq5/u66SI2HD9o9esqf67uDe/acov3oGPPaXNTvX5XkSvPabZtk01q9v0v43R446K9N+Rydny4o8qS5n2HQ7oAk5w7aPi9dCOnnSR6yQL8PSxeWmGt/twnj7zloc2G/Pz3JvRZoe/9+vLn2H1zk53r28Ofvn8vtFnj+PzTv93rHCf09eoHf/5ELPV9Jth8cv3Bw31Sf0yR3TPK3SW65SJtrpQtgzPX90SmfhfP7/bFJbr7AZ/Pp8963BX9vffu/m9f2lUmuMaHtvv3zf4sFrt0iyS/7Pi5N8tIkV12g3Y2SfGow3vtX8hkZ9Dv6Odnanqsl/nxHLvUZnHf/TrnsZ/lFE9odMWhzUpIDF2izTZInDp7Di5PsN6G/26YLj8z1+cVF3sfdkvxZkmePPP9HLnB9c5LTBm2em2TbCeNcPcljFhqnv374oJ9HT2hz43TfnXPtjklyk3ltNi
V5RroA0ly7V0zzDPfv7c+S/O4C7e6Wy36H/t/V+Ayt5Lmb1+6idN8BfzH/d5DLfq+u+nfGaj+/NpvNZrPZbDabzba
"text/plain": [
"<Figure size 2800x1200 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": []
},
{
"cell_type": "code",
"execution_count": 231,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src='hugetree.png'>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Random Forest\n",
"\n",
"**TASK: Create a Random Forest model and create a classification report and confusion matrix from its predicted results on the test set.**"
]
},
{
"cell_type": "code",
"execution_count": 259,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 269,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" precision recall f1-score support\n",
"\n",
" No 0.86 0.89 0.87 557\n",
" Yes 0.52 0.44 0.48 147\n",
"\n",
" accuracy 0.80 704\n",
" macro avg 0.69 0.67 0.68 704\n",
"weighted avg 0.79 0.80 0.79 704\n",
"\n"
]
}
],
"source": []
},
{
"cell_type": "code",
"execution_count": 270,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<sklearn.metrics._plot.confusion_matrix.ConfusionMatrixDisplay at 0x2d1e6a54040>"
]
},
"execution_count": 270,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAUIAAAEGCAYAAAAQZJzmAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAd4UlEQVR4nO3de5he473/8fcn5xARkbCDBKngh02oOraq0W6CitqU0tYme6NFtX5tUftCdXNVi2i24peWStCDQ1uHxvmw0ZZISLOJIhXHhByFSERm5vv7Y90TT8Yc1kyeZ555Zn1e17WuWetep3syl6/7Xvda91cRgZlZkfWodgXMzKrNgdDMCs+B0MwKz4HQzArPgdDMCq9XtSvQXkMG94ythveudjWsHV6ctV61q2Dt9B5LF0XE0I6ef+Dn1o/FS+pzHTtj1qp7I+Kgjt6rHGouEG41vDfT7h1e7WpYOxy42ehqV8Ha6YG49dV1OX/xknqm3Tsi17E9h700ZF3uVQ41FwjNrOsLoIGGalcjNwdCMyu7IFgd+brGXYEDoZlVhFuEZlZoQVBfQ5/vOhCaWUU04EBoZgUWQL0DoZkVnVuEZlZoAaz2M0IzK7Ig3DU2s4ILqK+dOOhAaGbll31ZUjscCM2sAkQ9qnYlcnMgNLOyywZLHAjNrMCy9wgdCM2s4BrcIjSzInOL0MwKLxD1NZQJxIHQzCrCXWMzK7RAfBg9q12N3Gqn7WpmNSN7obpHriUvST0lPSPprrR9vaS5kmamZXQql6SJkuZImiVpt7au7RahmVVEBQZLzgCeBwaWlH0vIm5tctxYYFRa9gSuTj9b5BahmZVdhKiPHrmWPCRtARwC/DLH4eOAKZF5AhgkaVhrJzgQmllFNKBcCzBE0vSS5aRmLncF8H0+/gnzRan7O0FS31S2OfB6yTFvpLIWuWtsZmWXDZbkDi+LImL3lnZKOhRYEBEzJO1fsusc4C2gDzAJOAu4sCP1dSA0s7JrHCwpk32BwyQdDPQDBkq6MSK+mvavkvQr4Ltp+01geMn5W6SyFrlrbGYVUR/KtbQlIs6JiC0iYivgGOChiPhq43M/SQIOB55Np9wBfD2NHu8FLIuI+a3dwy1CMyu7Tvqy5CZJQwEBM4FTUvlU4GBgDrACOKGtCzkQmllFNOQcEW6PiHgEeCStj2nhmABObc91HQjNrOyySRdq58mbA6GZlV0gVtfQJ3YOhGZWdhHkflm6K3AgNLMKWPOydE1wIDSzsgvcIjQz82CJmRVbIE/MambFlqXzrJ3wUjs1NbMa4gTvZlZwQWW+LKkUB0Izqwi3CM2s0CLkFqGZFVs2WOJP7Mys0OQXqs2s2LLBEj8jNLOCq6UvS2qnpmZWMxq/LMmz5NVMgvetJT2ZErn/TlKfVN43bc9J+7dq69oOhGZWEQ30yLW0Q2OC90aXABMiYhtgKTA+lY8HlqbyCem4VjkQmlnZRcDqhh65ljyaJnhPCZvGALemQyaTJXCCLMH75LR+K3BAOr5FfkZoZmWXdY1zt7OGSJpesj0pIiY1OeYKsgTvG6TtjYF3IqIubZcmcV+T4D0i6iQtS8cvaqkCDoRmVhHt+LKkowney8aBsJPV18PpB23LxsNW86Mpc5n5+AB+ceFmrF4tRu28kjMve42eveD9d3twyWlbsmBeH+rr4MhTFnLgMUuqXf1Cm/zkbFYu70lDA9TXidPHbsvIHVZy+o/foP/6Dbz9Rh8uOXUEK5bXzovElVLm12c+luAd+BkwSFKv1CosTeLemOD9DUm9gA2Bxa3doFOeEUoKSZeVbH9X0gWdce+u5o+/HMrwUasAaGiAn54xgnOufpVJD7/AJpt/yP03DwbgjuuHMGLbD7jmgRf46W1zmHThZqz+sHbey+quvn/UJ/jmF7bj9LHbAvDtS1/nuouHccoB2/Hnuwdy5DcWVLmGXUXWNc6ztKWFBO/HAQ8DR6bDjgduT+t3pG3S/o
dSis8WddZgySrgCElDOul+XdLCeb2Z9uBAxh6b/c/p3aU96d0n2OITWWDc7bPv8fjUQQBIsPL9nkTAB+/3ZINB9fTs1erf0qpgi5Gr+N8n1gfgmUc34NOHLKtyjbqOhpS3pK1lHZwFnClpDtkzwGtT+bXAxqn8TODsti7UWYGwDpgEfKfpDklbSXpI0ixJD0oa0Ul16nTXnL85//6f81D6V99wcD31deLFv/UH4PG7BrFwXm8ADjthEa+91Jdjd92Rk8dsxzcufJMeHuOvrhAX/+ZlrrznRcYel/3P7NUX+7H3Qe8C8JlDlzF0s9XVrGGXkY0a98y1tO+68UhEHJrWX46IPSJim4g4KiJWpfIP0vY2af/LbV23M//T+jlwnKQNm5T/NzA5InYGbgImNj1R0kmSpkuavnBxfSdUtfyeuH8gg4bUMWrnlWvKJDjn6le45vzNOf3gUfQfUL8m2M14ZAM+seNKfv3Mc1x1/wv8/NzNef89R8JqOvPwbTjtwG0597itOezfFrHTnsu5/MzhfPH4RVx5z4v0H1BPnR9fAJV5obqSOm2wJCLelTQF+BawsmTX3sARaf0G4CfNnDuJrEXJ7rv0q8n+4eyn1ueJ+wby1IM78OEqseK9nlxy2gjOuvI1Lv/jHCALfm+83BeA+343mC+ftgAJNt/6Q/5pxIe8Pqcf2++6opq/RqEtfitrrS9b3Js/37Mh2++6gluv2YQffOUTAGw+chV7HvBuNavYpdRSOs/ObmJcQfbW9/qdfN+qO/EH87lpxmymTJvNOVe/yi6ffo+zrnyNdxZl/y/6cJW4+apNOPRrWZdr6OarmflY9srU0oW9eOMffRk2YlXV6l90ffvX03/9+jXrn/zse7zy935suHHWFZaCY894m7tu2Lia1ewyGkeN3SJsRkQskXQzWTC8LhX/hWwk6AbgOOCxzqxTtd1y1SY8+cBAogEOOX4xoz+9HIDjvv0Wl357BCeP2Y4IGH/ufDbcuDYfC3QHGw2t4/xrXwGgZ6/g4T9sxPRHBnL4+IV88d+y93T/fPeG3PfbwVWsZddSSxOzqo1R5fLcRFoeEQPS+qbAXOAnEXGBpC2BXwFDgIXACRHxWkvX2n2XfjHt3uEVr7OVz4Gbja52FaydHohbZ7T2knNbNtp+kxhz3ZFtHwj8ft+r1+le5dApLcLGIJjW3wbWK9l+leybQTPrRrpKtzcPf1liZmXniVnNzHAgNLOCa3yPsFY4EJpZRdTSe4QOhGZWdhFQl3PS1a7AgdDMKsJdYzMrND8jNDMDwoHQzIrOgyVmVmgRtfWMsHaGdcyshoj6hh65ljavJPWTNE3S3yQ9J+mHqfx6SXMlzUzL6FQuSRNTgvdZknZr6x5uEZpZRZTxGeEqYExELJfUG3hc0t1p3/ci4tYmx48FRqVlT+Dq9LNFDoRmVnbl/NY4JV5anjZ7p6W1abPGAVPSeU9IGiRpWETMb+kEd43NrPwie06YZyEleC9ZTmp6OUk9Jc0EFgD3R8STaddFqfs7QVLfVLYmwXtSmvy9WW4RmllFtGPUuNUE7wARUQ+MljQI+IOknYBzgLeAPmSpPM4CLuxIXd0iNLOyizIOlqx13Yh3yPIZHxQR8yOzimxy5z3SYY0J3huVJn9vlgOhmVVEO7rGrZI0NLUEkdQf+ALwd0nDUpmAw4Fn0yl3AF9Po8d7Actaez4I7hqbWYWUcdR4GDBZUk+yxtvNEXFXyoc+FBAwEzglHT8VOBiYA6wATmjrBg6EZlZ2WWuvbKPGs4BdmylvNsVHGi0+tT33cCA0s4qopS9LHAjNrCI6IUFm2TgQmlnZBaLBE7OaWdHVUIPQgdDMKqCMgyWdwYHQzCqjhpqEDoRmVhHdokUo6b9pJaZHxLcqUiMzq3kBNDR0g0AITO+0WphZ9xJAd2gRRsTk0m1J60XEispXycy6g1p6j7DNF30k7S1pNvD3tL2LpKsqXjMzq22Rc+kC8rzxeAVwIL
AYICL+BuxXwTqZWc0TEfmWriDXqHFEvJ7NdLNGfWWqY2bdRhdp7eWRJxC+LmkfIFLilDOA5ytbLTOraQFRQ6PGebr
"text/plain": [
"<Figure size 432x288 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Boosted Trees\n",
"\n",
"**TASK: Use AdaBoost or Gradient Boosting to create a model and report back the classification report and plot a confusion matrix for its predicted results**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 292,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" precision recall f1-score support\n",
"\n",
" No 0.88 0.90 0.89 557\n",
" Yes 0.60 0.54 0.57 147\n",
"\n",
" accuracy 0.83 704\n",
" macro avg 0.74 0.72 0.73 704\n",
"weighted avg 0.82 0.83 0.83 704\n",
"\n"
]
}
],
"source": []
},
{
"cell_type": "code",
"execution_count": 293,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<sklearn.metrics._plot.confusion_matrix.ConfusionMatrixDisplay at 0x2d1e9373a30>"
]
},
"execution_count": 293,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAUIAAAEGCAYAAAAQZJzmAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAd4UlEQVR4nO3de5he473/8fcn5xARkbCDBKngh02oOraq0W6CitqU0tYme6NFtX5tUftCdXNVi2i24peWStCDQ1uHxvmw0ZZISLOJIhXHhByFSERm5vv7Y90TT8Yc1kyeZ555Zn1e17WuWetep3syl6/7Xvda91cRgZlZkfWodgXMzKrNgdDMCs+B0MwKz4HQzArPgdDMCq9XtSvQXkMG94ythveudjWsHV6ctV61q2Dt9B5LF0XE0I6ef+Dn1o/FS+pzHTtj1qp7I+Kgjt6rHGouEG41vDfT7h1e7WpYOxy42ehqV8Ha6YG49dV1OX/xknqm3Tsi17E9h700ZF3uVQ41FwjNrOsLoIGGalcjNwdCMyu7IFgd+brGXYEDoZlVhFuEZlZoQVBfQ5/vOhCaWUU04EBoZgUWQL0DoZkVnVuEZlZoAaz2M0IzK7Ig3DU2s4ILqK+dOOhAaGbll31ZUjscCM2sAkQ9qnYlcnMgNLOyywZLHAjNrMCy9wgdCM2s4BrcIjSzInOL0MwKLxD1NZQJxIHQzCrCXWMzK7RAfBg9q12N3Gqn7WpmNSN7obpHriUvST0lPSPprrR9vaS5kmamZXQql6SJkuZImiVpt7au7RahmVVEBQZLzgCeBwaWlH0vIm5tctxYYFRa9gSuTj9b5BahmZVdhKiPHrmWPCRtARwC/DLH4eOAKZF5AhgkaVhrJzgQmllFNKBcCzBE0vSS5aRmLncF8H0+/gnzRan7O0FS31S2OfB6yTFvpLIWuWtsZmWXDZbkDi+LImL3lnZKOhRYEBEzJO1fsusc4C2gDzAJOAu4sCP1dSA0s7JrHCwpk32BwyQdDPQDBkq6MSK+mvavkvQr4Ltp+01geMn5W6SyFrlrbGYVUR/KtbQlIs6JiC0iYivgGOChiPhq43M/SQIOB55Np9wBfD2NHu8FLIuI+a3dwy1CMyu7Tvqy5CZJQwEBM4FTUvlU4GBgDrACOKGtCzkQmllFNOQcEW6PiHgEeCStj2nhmABObc91HQjNrOyySRdq58mbA6GZlV0gVtfQJ3YOhGZWdhHkflm6K3AgNLMKWPOydE1wIDSzsgvcIjQz82CJmRVbIE/MambFlqXzrJ3wUjs1NbMa4gTvZlZwQWW+LKkUB0Izqwi3CM2s0CLkFqGZFVs2WOJP7Mys0OQXqs2s2LLBEj8jNLOCq6UvS2qnpmZWMxq/LMmz5NVMgvetJT2ZErn/TlKfVN43bc9J+7dq69oOhGZWEQ30yLW0Q2OC90aXABMiYhtgKTA+lY8HlqbyCem4VjkQmlnZRcDqhh65ljyaJnhPCZvGALemQyaTJXCCLMH75LR+K3BAOr5FfkZoZmWXdY1zt7OGSJpesj0pIiY1OeYKsgTvG6TtjYF3IqIubZcmcV+T4D0i6iQtS8cvaqkCDoRmVhHt+LKkowney8aBsJPV18PpB23LxsNW86Mpc5n5+AB+ceFmrF4tRu28kjMve42eveD9d3twyWlbsmBeH+rr4MhTFnLgMUuqXf1Cm/zkbFYu70lDA9TXidPHbsvIHVZy+o/foP/6Dbz9Rh8uOXUEK5bXzovElVLm12c+luAd+BkwSFKv1CosTeLemOD9DUm9gA2Bxa3doFOeEUoKSZeVbH9X0gWdce+u5o+/HMrwUasAaGiAn54xgnOufpVJD7/AJpt/yP03DwbgjuuHMGLbD7jmgRf46W1zmHThZqz+sHbey+quvn/UJ/jmF7bj9LHbAvDtS1/nuouHccoB2/Hnuwdy5DcWVLmGXUXWNc6ztKWFBO/HAQ8DR6bDjgduT+t3pG3S/o
dSis8WddZgySrgCElDOul+XdLCeb2Z9uBAxh6b/c/p3aU96d0n2OITWWDc7bPv8fjUQQBIsPL9nkTAB+/3ZINB9fTs1erf0qpgi5Gr+N8n1gfgmUc34NOHLKtyjbqOhpS3pK1lHZwFnClpDtkzwGtT+bXAxqn8TODsti7UWYGwDpgEfKfpDklbSXpI0ixJD0oa0Ul16nTXnL85//6f81D6V99wcD31deLFv/UH4PG7BrFwXm8ADjthEa+91Jdjd92Rk8dsxzcufJMeHuOvrhAX/+ZlrrznRcYel/3P7NUX+7H3Qe8C8JlDlzF0s9XVrGGXkY0a98y1tO+68UhEHJrWX46IPSJim4g4KiJWpfIP0vY2af/LbV23M//T+jlwnKQNm5T/NzA5InYGbgImNj1R0kmSpkuavnBxfSdUtfyeuH8gg4bUMWrnlWvKJDjn6le45vzNOf3gUfQfUL8m2M14ZAM+seNKfv3Mc1x1/wv8/NzNef89R8JqOvPwbTjtwG0597itOezfFrHTnsu5/MzhfPH4RVx5z4v0H1BPnR9fAJV5obqSOm2wJCLelTQF+BawsmTX3sARaf0G4CfNnDuJrEXJ7rv0q8n+4eyn1ueJ+wby1IM78OEqseK9nlxy2gjOuvI1Lv/jHCALfm+83BeA+343mC+ftgAJNt/6Q/5pxIe8Pqcf2++6opq/RqEtfitrrS9b3Js/37Mh2++6gluv2YQffOUTAGw+chV7HvBuNavYpdRSOs/ObmJcQfbW9/qdfN+qO/EH87lpxmymTJvNOVe/yi6ffo+zrnyNdxZl/y/6cJW4+apNOPRrWZdr6OarmflY9srU0oW9eOMffRk2YlXV6l90ffvX03/9+jXrn/zse7zy935suHHWFZaCY894m7tu2Lia1ewyGkeN3SJsRkQskXQzWTC8LhX/hWwk6AbgOOCxzqxTtd1y1SY8+cBAogEOOX4xoz+9HIDjvv0Wl357BCeP2Y4IGH/ufDbcuDYfC3QHGw2t4/xrXwGgZ6/g4T9sxPRHBnL4+IV88d+y93T/fPeG3PfbwVWsZddSSxOzqo1R5fLcRFoeEQPS+qbAXOAnEXGBpC2BXwFDgIXACRHxWkvX2n2XfjHt3uEVr7OVz4Gbja52FaydHohbZ7T2knNbNtp+kxhz3ZFtHwj8ft+r1+le5dApLcLGIJjW3wbWK9l+leybQTPrRrpKtzcPf1liZmXniVnNzHAgNLOCa3yPsFY4EJpZRdTSe4QOhGZWdhFQl3PS1a7AgdDMKsJdYzMrND8jNDMDwoHQzIrOgyVmVmgRtfWMsHaGdcyshoj6hh65ljavJPWTNE3S3yQ9J+mHqfx6SXMlzUzL6FQuSRNTgvdZknZr6x5uEZpZRZTxGeEqYExELJfUG3hc0t1p3/ci4tYmx48FRqVlT+Dq9LNFDoRmVnbl/NY4JV5anjZ7p6W1abPGAVPSeU9IGiRpWETMb+kEd43NrPwie06YZyEleC9ZTmp6OUk9Jc0EFgD3R8STaddFqfs7QVLfVLYmwXtSmvy9WW4RmllFtGPUuNUE7wARUQ+MljQI+IOknYBzgLeAPmSpPM4CLuxIXd0iNLOyizIOlqx13Yh3yPIZHxQR8yOzimxy5z3SYY0J3huVJn9vlgOhmVVEO7rGrZI0NLUEkdQf+ALwd0nDUpmAw4Fn0yl3AF9Po8d7Actaez4I7hqbWYWUcdR4GDBZUk+yxtvNEXFXyoc+FBAwEzglHT8VOBiYA6wATmjrBg6EZlZ2WWuvbKPGs4BdmylvNsVHGi0+tT33cCA0s4qopS9LHAjNrCI6IUFm2TgQmlnZBaLBE7OaWdHVUIPQgdDMKqCMgyWdwYHQzCqjhpqEDoRmVhHdokUo6b9pJaZHxLcqUiMzq3kBNDR0g0AITO+0WphZ9xJAd2gRRsTk0m1J60XEispXycy6g1p6j7DNF30k7S1pNvD3tL2LpKsqXjMzq22Rc+kC8rzxeAVwIL
AYICL+BuxXwTqZWc0TEfmWriDXqHFEvJ7NdLNGfWWqY2bdRhdp7eWRJxC+LmkfIFLilDOA5ytbLTOraQFRQ6PGebr
"text/plain": [
"<Figure size 432x288 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Analyze your results, which model performed best for you?**"
]
},
{
"cell_type": "code",
"execution_count": 294,
"metadata": {},
"outputs": [],
"source": [
"# With base models, we got best performance from an AdaBoostClassifier, but note, we didn't do any gridsearching AND most models performed about the same on the data set."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Great job!"
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 1
}