{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"___\n",
"\n",
"<a href='http://www.pieriandata.com'><img src='../Pierian_Data_Logo.png'/></a>\n",
"___\n",
"<center><em>Copyright by Pierian Data Inc.</em></center>\n",
"<center><em>For more information, visit us at <a href='http://www.pieriandata.com'>www.pieriandata.com</a></em></center>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# K-Means Clustering\n",
"\n",
"\n",
"Let's work through an example of unsupervised learning - clustering customer data.\n",
"\n",
"## Goal:\n",
"\n",
"When working with unsupervised learning methods, its usually important to lay out a general goal. In our case, let's attempt to find reasonable clusters of customers for marketing segmentation and study. What we end up doing with those clusters would depend **heavily** on the domain itself, in this case, marketing.\n",
"\n",
"----"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The Data\n",
"\n",
"LINK: https://archive.ics.uci.edu/ml/datasets/bank+marketing\n",
"\n",
" This dataset is public available for research. The details are described in [Moro et al., 2011]. \n",
"\n",
"\n",
" [Moro et al., 2011] S. Moro, R. Laureano and P. Cortez. Using Data Mining for Bank Direct Marketing: An Application of the CRISP-DM Methodology. \n",
" In P. Novais et al. (Eds.), Proceedings of the European Simulation and Modelling Conference - ESM'2011, pp. 117-121, Guimarães, Portugal, October, 2011. EUROSIS.\n",
"\n",
" Available at: [pdf] http://hdl.handle.net/1822/14838\n",
" [bib] http://www3.dsi.uminho.pt/pcortez/bib/2011-esm-1.txt\n",
" For more information, read [Moro et al., 2011]."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" # bank client data:\n",
" 1 - age (numeric)\n",
" 2 - job : type of job (categorical: 'admin.','blue-collar','entrepreneur','housemaid','management','retired','self-employed','services','student','technician','unemployed','unknown')\n",
" 3 - marital : marital status (categorical: 'divorced','married','single','unknown'; note: 'divorced' means divorced or widowed)\n",
" 4 - education (categorical: 'basic.4y','basic.6y','basic.9y','high.school','illiterate','professional.course','university.degree','unknown')\n",
" 5 - default: has credit in default? (categorical: 'no','yes','unknown')\n",
" 6 - housing: has housing loan? (categorical: 'no','yes','unknown')\n",
" 7 - loan: has personal loan? (categorical: 'no','yes','unknown')\n",
" # related with the last contact of the current campaign:\n",
" 8 - contact: contact communication type (categorical: 'cellular','telephone')\n",
" 9 - month: last contact month of year (categorical: 'jan', 'feb', 'mar', ..., 'nov', 'dec')\n",
" 10 - day_of_week: last contact day of the week (categorical: 'mon','tue','wed','thu','fri')\n",
" 11 - duration: last contact duration, in seconds (numeric). Important note: this attribute highly affects the output target (e.g., if duration=0 then y='no'). Yet, the duration is not known before a call is performed. Also, after the end of the call y is obviously known. Thus, this input should only be included for benchmark purposes and should be discarded if the intention is to have a realistic predictive model.\n",
" # other attributes:\n",
" 12 - campaign: number of contacts performed during this campaign and for this client (numeric, includes last contact)\n",
" 13 - pdays: number of days that passed by after the client was last contacted from a previous campaign (numeric; 999 means client was not previously contacted)\n",
" 14 - previous: number of contacts performed before this campaign and for this client (numeric)\n",
" 15 - poutcome: outcome of the previous marketing campaign (categorical: 'failure','nonexistent','success')\n",
" # social and economic context attributes\n",
" 16 - emp.var.rate: employment variation rate - quarterly indicator (numeric)\n",
" 17 - cons.price.idx: consumer price index - monthly indicator (numeric)\n",
" 18 - cons.conf.idx: consumer confidence index - monthly indicator (numeric)\n",
" 19 - euribor3m: euribor 3 month rate - daily indicator (numeric)\n",
" 20 - nr.employed: number of employees - quarterly indicator (numeric)\n",
" 21 - y - has the client subscribed a term deposit? (binary: 'yes','no')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Imports"
]
},
{
"cell_type": "code",
"execution_count": 166,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exploratory Data Analysis"
]
},
{
"cell_type": "code",
"execution_count": 167,
"metadata": {},
"outputs": [],
"source": [
"df = pd.read_csv(\"../DATA/bank-full.csv\")"
]
},
{
"cell_type": "code",
"execution_count": 168,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>age</th>\n",
" <th>job</th>\n",
" <th>marital</th>\n",
" <th>education</th>\n",
" <th>default</th>\n",
" <th>housing</th>\n",
" <th>loan</th>\n",
" <th>contact</th>\n",
" <th>month</th>\n",
" <th>day_of_week</th>\n",
" <th>...</th>\n",
" <th>campaign</th>\n",
" <th>pdays</th>\n",
" <th>previous</th>\n",
" <th>poutcome</th>\n",
" <th>emp.var.rate</th>\n",
" <th>cons.price.idx</th>\n",
" <th>cons.conf.idx</th>\n",
" <th>euribor3m</th>\n",
" <th>nr.employed</th>\n",
" <th>subscribed</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>56</td>\n",
" <td>housemaid</td>\n",
" <td>married</td>\n",
" <td>basic.4y</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>telephone</td>\n",
" <td>may</td>\n",
" <td>mon</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>nonexistent</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>no</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>57</td>\n",
" <td>services</td>\n",
" <td>married</td>\n",
" <td>high.school</td>\n",
" <td>unknown</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>telephone</td>\n",
" <td>may</td>\n",
" <td>mon</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>nonexistent</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>no</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>37</td>\n",
" <td>services</td>\n",
" <td>married</td>\n",
" <td>high.school</td>\n",
" <td>no</td>\n",
" <td>yes</td>\n",
" <td>no</td>\n",
" <td>telephone</td>\n",
" <td>may</td>\n",
" <td>mon</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>nonexistent</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>no</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>40</td>\n",
" <td>admin.</td>\n",
" <td>married</td>\n",
" <td>basic.6y</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>telephone</td>\n",
" <td>may</td>\n",
" <td>mon</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>nonexistent</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>no</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>56</td>\n",
" <td>services</td>\n",
" <td>married</td>\n",
" <td>high.school</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>yes</td>\n",
" <td>telephone</td>\n",
" <td>may</td>\n",
" <td>mon</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>nonexistent</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>no</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 21 columns</p>\n",
"</div>"
],
"text/plain": [
" age job marital education default housing loan contact \\\n",
"0 56 housemaid married basic.4y no no no telephone \n",
"1 57 services married high.school unknown no no telephone \n",
"2 37 services married high.school no yes no telephone \n",
"3 40 admin. married basic.6y no no no telephone \n",
"4 56 services married high.school no no yes telephone \n",
"\n",
" month day_of_week ... campaign pdays previous poutcome emp.var.rate \\\n",
"0 may mon ... 1 999 0 nonexistent 1.1 \n",
"1 may mon ... 1 999 0 nonexistent 1.1 \n",
"2 may mon ... 1 999 0 nonexistent 1.1 \n",
"3 may mon ... 1 999 0 nonexistent 1.1 \n",
"4 may mon ... 1 999 0 nonexistent 1.1 \n",
"\n",
" cons.price.idx cons.conf.idx euribor3m nr.employed subscribed \n",
"0 93.994 -36.4 4.857 5191.0 no \n",
"1 93.994 -36.4 4.857 5191.0 no \n",
"2 93.994 -36.4 4.857 5191.0 no \n",
"3 93.994 -36.4 4.857 5191.0 no \n",
"4 93.994 -36.4 4.857 5191.0 no \n",
"\n",
"[5 rows x 21 columns]"
]
},
"execution_count": 168,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 169,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Index(['age', 'job', 'marital', 'education', 'default', 'housing', 'loan',\n",
" 'contact', 'month', 'day_of_week', 'duration', 'campaign', 'pdays',\n",
" 'previous', 'poutcome', 'emp.var.rate', 'cons.price.idx',\n",
" 'cons.conf.idx', 'euribor3m', 'nr.employed', 'subscribed'],\n",
" dtype='object')"
]
},
"execution_count": 169,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.columns"
]
},
{
"cell_type": "code",
"execution_count": 170,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 41188 entries, 0 to 41187\n",
"Data columns (total 21 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 age 41188 non-null int64 \n",
" 1 job 41188 non-null object \n",
" 2 marital 41188 non-null object \n",
" 3 education 41188 non-null object \n",
" 4 default 41188 non-null object \n",
" 5 housing 41188 non-null object \n",
" 6 loan 41188 non-null object \n",
" 7 contact 41188 non-null object \n",
" 8 month 41188 non-null object \n",
" 9 day_of_week 41188 non-null object \n",
" 10 duration 41188 non-null int64 \n",
" 11 campaign 41188 non-null int64 \n",
" 12 pdays 41188 non-null int64 \n",
" 13 previous 41188 non-null int64 \n",
" 14 poutcome 41188 non-null object \n",
" 15 emp.var.rate 41188 non-null float64\n",
" 16 cons.price.idx 41188 non-null float64\n",
" 17 cons.conf.idx 41188 non-null float64\n",
" 18 euribor3m 41188 non-null float64\n",
" 19 nr.employed 41188 non-null float64\n",
" 20 subscribed 41188 non-null object \n",
"dtypes: float64(5), int64(5), object(11)\n",
"memory usage: 6.6+ MB\n"
]
}
],
"source": [
"df.info()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Continuous Feature Analysis"
]
},
{
"cell_type": "code",
"execution_count": 171,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='age', ylabel='Count'>"
]
},
"execution_count": 171,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAB+0AAAQICAYAAADIsEkuAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAB7CAAAewgFu0HU+AACAEElEQVR4nOzdabhmV1kn/P9dFqQyFUMFTiiSFiEGwmBbCgE7UaC1oRE6BYpKo0CQKa+AokCU1m5b2uG1aFoUG2VqArZDUAkpEBq7aWYRAu9xQAgRBEkg9WSgpIokVSSp9X44u8zD4Yz1POesSp3f77r2tdd69tr3vit843/22tVaCwAAAAAAAACw/jb1bgAAAAAAAAAANiqhPQAAAAAAAAB0IrQHAAAAAAAAgE6E9gAAAAAAAADQidAeAAAAAAAAADoR2gMAAAAAAABAJ0J7AAAAAAAAAOhEaA8AAAAAAAAAnQjtAQAAAAAAAKAToT0AAAAAAAAAdCK0BwAAAAAAAIBOhPYAAAAAAAAA0InQHgAAAAAAAAA6EdoDAAAAAAAAQCdCewAAAAAAAADoRGgPAAAAAAAAAJ1s7t0Ax46qOi7Jg4bptUlu7dgOAAAAAAAAwDR9U5K7DeO/ba0dnEZRoT3T9KAkl/VuAgAAAAAAAGCNPSTJx6ZRyPb4AAAAAAAAANCJN+2ZpmsPDz760Y/mHve4R89eAAAAAAAAAKbm6quvztlnn314eu1Sa1dDaM80/fM37O9xj3vktNNO69kLAAAAAAAAwFq5dfklK2N7fAAAAAAAAADoRGgPAAAAAAAAAJ0I7QEAAAAAAACgE6E9AAAAAAAAAHQitAcAAAAAAACAToT2AAAAAAAAANCJ0B4AAAAAAAAAOhHaAwAAAAAAAEAnGza0r6oHV9V/qqo/r6qrqupgVX21qq6oqjdU1bmrrPeYqrpkrNZVw/wxq6ixuaouqKoPVNW1VXVTVX22ql5dVQ9YRZ1TquqlVfU3VbVvOP5m+G3bav5dAAAAAAAAAKydaq317mHdVdX7k3z3Cpa+KcmzWmtfW6LWpiSvSfKMJeq8LslzWmuHlqhzSpJ3JHnIIksOJnlea+11SzVcVQ9N8tYkpy6y5Ookj2+tfXSpOkeiqk5LcmWSXHnllTnttNOm/QgAAAAAAACALq666qqcfvrph6ent9aumkbdjfqm/fbh/KUkv5nkiUnOTvJdSX4myReH609NctEytX4ltwX2s0n+/VDr3w/zJHlmkl9erEBVfVOSS3JbYP+WJI9J8tAkP5nkmiTHJXn1Um/uV9XpSd6WucD+liS7knzPcOwafrtHkrcNATsAAAAAAAAAHW3UN+3fnrm36P+0tXbrAtdPSfKhJGcOPz28tfb+BdadmeTvkmxO8rEk39Nau2ns+glJ3pfkwZkLzM9qrX1mgTo/nuT1w/RVrbXnzrt+RpKPJ9ma5DNDnVsWqPOmJE8Zpj/cWvvjedd/OMnFw/SNrbXz59eYhDftAQAAAAAAgGOVN+2nqLX2uNbamxcK7Ifr1yV54dhPT1yk1AsyF9gnyfPHA/uhzo1Jnj9MNyf56UXqvGg4fznJixfo5zNJfm2YnpHkCfPXVNWpSX50mL5rfmA/1HlzkncN06cM9wAAAAAAAADQyYYM7VfoPWPj+8y/WFWVZOcwvby19pcLFRl+//Qw3TncN17nzCRnDdM3D0H/Qi4aG39DaJ/kvNz2v+cbFqkxXmfTcA8AAAAAAAAAnQjtF3fc2HihN/K/Jcn2Yfy+ZWodvn7PJPead+3cBdZ9g9baniRXDNNzFliyojrzri1UBwAAAAAAAIB1snn5JRvWw8fGn1rg+v3HxpcvU2v8+llJPjdBnTOTnF5VJ7bWbligzleGgH9BrbWrq2pfkq257Q3/FRm+Wb8U2+0DAAAAAAAArILQfgFVtSnJz4399OYFlo0H2FctU/LKsfHpU6hTw32fHrt2uM5yNQ7XecACvazkPg
AAAAAAAACmxPb4C/vpJGcP47e01j6+wJqTx8ZfXabe+BvxJ61xneVqjNeZXwMAAAAAAACAdeRN+3mq6uFJ/t9hek2S/2eRpVvGxl9bpuzBsfHxa1xnuRrjdebXWM5yb+afmuSyVdYEAAAAAAAA2LCE9mOq6gFJLsncf5cDSX6otXbNIssPjI3vuEzp48bGNy1T50AWt1ydE1bQy3id+TWW1Fpbcuv9qlpNOQAAAAAAAIANz/b4g6r6liR/nuQuSW5N8qTW2vuXuGX/2Hi5beZPHBvP375+2nVWsuX94Tor2UofAAAAAAAAgDUitE9SVduT/J8k25O0JD/eWrt0mdvG3zo/bZm149vKXzmFOm3efeN1lqsxXmd+LwAAAAAAAACsow0f2lfVKUn+d5J7Dz89v7X2phXc+smx8f2WWTt+/VNTqHNla+2GRercqapOXaxAVd0jydZFegEAAAAAAABgHW3o0L6q7pTkXUnuP/z0c621/77C2z+X5EvD+OHLrP2e4fzFJJ+fd+2DY+NF6wxB/JnD9EMLLFlRnXnXFqoDAAAAAAAAwDrZsKF9VZ2Q5M+SfMfw06+01n59pfe31lqSw1vo36+qHrbIcx6W296Qv3S4b7zOFbntjfcfHvpayPlj40sWuL47yaFh/PQlWj9c59BwDwAAAAAAAACdbMjQvqrumLng+5zhp99srf3CEZR6RZJbh/Erq+r4ec85Pskrh+ktw/qF/NfhfNckuxbo9z5JXjJMP5MFQvvW2p4kvz9MH11VT1ygzg8lefQw/b3hHgAAAAAAAAA62dy7gU7+MMmjhvH/TfL6qnrgEuu/NrwR/3Vaa1dU1cuS/FySByf5UFX9epLPJrlPkp9NsmNY/rLW2t8vUv+NSX48c39E8NxhK/zXJtmb5Owk/zFz36E/lOQnW2u3LFLn55P82yR3S/KHVfXgJG8frj0uyQuH8bVJjuSPFAAAAAAAAACYopq3W/uGUFWr/Uf/Y2vtXovU2pS5gP3Hl7j/9Ume3Vo7tNiCqjolyTuSPGSRJQeTPK+19rqlGq2qhyZ5a5JTF1myJ8njW2sfWarOkaiq05JcmSRXXnllTjvttGk/AgAAAAAAAKCLq666Kqeffvrh6emttaumUXdDbo8/Ta21Q621ZyR5bOa+cf+lJF8bzpcm+f7W2jOXCuyHOtcl+VdJfiLJB5Ncn+RAkn/I3B8FfOdygf1Q5yNJHpTkl5N8IslXh+Nvh98euBaBPQAAAAAAAACrtyHftGdteNMeAAAAAAAAOFZ50x4AAAAAAAAAjjFCewAAAAAAAADoRGgPAAAAAAAAAJ0I7QEAAAAAAACgE6E9AAAAAAAAAHQitAcAAAAAAACAToT2AAAAAAAAANCJ0B4AAAAAAAAAOhHaAwAAAAAAAEAnQnsAAAAAAAAA6GRz7wYAgKPTjh07MhqNplJrZmYms7OzU6kFAAAAAADHEqE9ALCg0WiUq/fsyZat2yaqc2Df9VPqCAAAAAAAjj1CewBgUVu2bsvOXbsnqnHphedNqRsAAAAAADj2+KY9AAAAAAAAAHQitAcAAAAAAACAToT2AAAAAAAAANCJ0B4AAAAAAAAAOhHaAwAAAAAAAEAnQnsAAAAAAAAA6ERoDwAAAAAAAACdCO0BAAAAAAAAoBOhPQAAAAAAAAB0IrQHAAAAAAAAgE6E9gAAAAAAAADQidAeAAAAAAAAADoR2gMAAAAAAABAJ0J7AAAAAAAAAOhEaA8AAAAAAAAAnQjtAQAAAAAAAKAToT0AAAAAAAAAdCK0BwAAAAAAAIBOhPYAAAAAAAAA0InQHgAAAAAAAAA6EdoDAAAAAAAAQCdCewAAAAAAAADoRGgPAAAAAAAAAJ0I7QEAAAAAAACgE6E9AAAAAAAAAHQitAcAAAAAAACAToT2AAAAAAAAANCJ0B4AAAAAAAAAOhHaAwAAAAAAAEAnQnsAAAAAAAAA6ERoDwAAAAAAAACdCO0BAAAAAAAAoB
OhPQAAAAAAAAB0IrQHAAAAAAAAgE6E9gAAAAAAAADQidAeAAAAAAAAADoR2gMAAAAAAABAJ0J7AAAAAAAAAOhEaA8
"text/plain": [
"<Figure size 2400x1200 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(12,6),dpi=200)\n",
"sns.histplot(data=df,x='age')"
]
},
{
"cell_type": "code",
"execution_count": 172,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='age', ylabel='Count'>"
]
},
"execution_count": 172,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAB+0AAAQICAYAAADIsEkuAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAB7CAAAewgFu0HU+AACn70lEQVR4nOzdebhdVX0//vdKQgKBBBMDiSFUUEEUtUZAUKLQ4gxURVGLRcMgtHUuzmOUOhTEEacik19sUVFQFCpOIJMyGH+CqIiiMuUyJGaEhOSu3x/33HByuUOSO+wk9/V6nvOctfde+7M/J5Z/+r5r7VJrDQAAAAAAAAAw8sY03QAAAAAAAAAAjFZCewAAAAAAAABoiNAeAAAAAAAAABoitAcAAAAAAACAhgjtAQAAAAAAAKAhQnsAAAAAAAAAaIjQHgAAAAAAAAAaIrQHAAAAAAAAgIYI7QEAAAAAAACgIUJ7AAAAAAAAAGiI0B4AAAAAAAAAGiK0BwAAAAAAAICGCO0BAAAAAAAAoCFCewAAAAAAAABoiNAeAAAAAAAAABoitAcAAAAAAACAhoxrugG2HKWUCUme3Dq8J8maBtsBAAAAAAAAGEpjk+zQGt9Qa105FEWF9gylJye5tukmAAAAAAAAAIbZPkmuG4pCtscHAAAAAAAAgIZYac9Quqd7cM011+RRj3pUk70AAAAAAAAADJm77rorT3/607sP7+lv7oYQ2jOU1r7D/lGPelRmzZrVZC8AAAAAAAAAw2XNwFPWj+3xAQAAAAAAAKAhQnsAAAAAAAAAaIjQHgAAAAAAAAAaIrQHAAAAAAAAgIYI7QEAAAAAAACgIUJ7AAAAAAAAAGiI0B4AAAAAAAAAGiK0BwAAAAAAAICGCO0BAAAAAAAAoCFCewAAAAAAAABoyLimGwAAAAAAAIDRqrOzM8uWLcuSJUuyatWqrFmzpumWYIs1duzYjB8/PpMnT852222XMWM2jTXuQnsAAAAAAABowNKlS3PHHXek1tp0KzAqrF69OitXrszSpUtTSslOO+2USZMmNd2W0B4AAAAAAABGWm+BfSklY8eObbAr2LKtWbNm7X9ztdbccccdm0RwL7QHAAAAAACAEdTZ2blOYL/ddttl6tSpmThxYkopDXcHW65aa1asWJGFCxdm2bJla4P73XffvdGt8jeNTfoBAAAAAABglOgOC5OuwH7WrFnZdtttBfYwzEop2XbbbTNr1qxst912SbqC/GXLljXal9AeAAAAAAAARtCSJUvWjqdOnSqshxFWSsnUqVPXHrf/N9kEoT0AAAAAAACMoFWrViXpCg4nTpzYcDcwOrW/jqL7v8mmCO0BAAAAAABgBK1ZsyZJMnbsWKvsoSGllIwdOzbJQ/9NNkVoDwAAAAAAAAANEdoDAAAAAAAAQEOE9gAAAAAAAADQEKE9AAAAAAAAADREaA8AAAAAAAAADRHaAwAAAAAAAFuEs846K6WUlFLy5z//uel2YL0I7QEAAAAAAACgIUJ7AAAAAAAAAGiI0B4AAAAAAAAAGiK0BwAAAAAAAICGCO0BAAAAAAAAoCFCewAAAAAAAGDUuOeee/K+970vs2fPziMe8YhsvfXW2WWXXXLkkUfmiiuu6PfeVatW5cILL8wb3vCG7LPPPpkyZUq22mqrPPKRj8y+++6befPm5d577+23xi677JJSSubOnZsk+f3vf5/Xve512WWXXTJhwoRMnz49L33pS/Pzn/98qH4ym7hxTTcAAAAAAAAAMBIuueSSHH744VmyZMk65//yl7/kL3/5S84555y8/vWvz2c/+9mMGfPw9c/HHXdczj777IedX7hwYa655ppcc801OfXUU/Od73wn+++//4D9nH/++fmXf/mXrFixYu25u+++OxdccEEuvPDCfO1rX8srX/nKjfilbE6stAcAAAAAAAC2eL/61a9y6KGHZsmSJdlqq63y1re+NT/96U9zzTXX5Mtf/n
J23XXXJMnnP//5vPvd7+61xurVq/OYxzwmJ5xwQr7+9a/n6quvzrXXXpvzzjsv//qv/5rx48fnvvvuy0tf+tLcfffd/fZzww035Igjjsj06dNz6qmn5uc//3muvvrqzJs3L1tvvXXWrFmT4447Lvfcc8+Q/1uwabHSHgAAAAAAANjiHXfccVm1alXGjh2b733ve3ne85639to+++yTww8/PHPmzMlNN92UT3ziE3nNa16TPffcc50aH/rQh/KYxzwmpZR1zu+999552cteln//93/PM5/5zNxzzz353Oc+lxNPPLHPfn75y19mr732yk9+8pNMnjx57fn99tsvj3vc4/Iv//IvWbJkSc4555y89a1vHaJ/BTZFVtoDAAAAAAAAW7Rrrrkm1157bZLkda973TqBfbcpU6bkv//7v5MknZ2d+cIXvvCwOY997GMfFti3e/KTn5xjjz02SXLBBRcM2NcZZ5yxTmDf7YgjjsjMmTOTJJdffvmAddi8WWkPAAAAAAAAbNF+9KMfrR0fc8wxfc7bf//984QnPCG//e1v17mnL4sWLcrChQvzwAMPpNaaJHnEIx6RJLnpppvy4IMPZquttur13ic/+cl5ylOe0uu1Ukpmz56dO++8M3/6058G7IPNm9AeAAAAAAAA2KLdeOONSZLx48fnqU99ar9z99133/z2t7/NH/7wh6xatSrjx49f5/oNN9yQT33qU7n44ouzYMGCPut0dnZm0aJF2XHHHXu9vscee/Tbx9SpU5MkS5cu7Xcemz+hPQAAAAAAALBFW7hwYZKuIHzcuP4j0hkzZiRJaq1ZtGhRpk+fvvba6aefnn/913/N6tWr1+u5999/f5/XJk6c2O+9Y8Z0vel8zZo16/UsNl/eaQ8AAAAAAACMCv29j34gv/vd79YG9jvuuGNOPvnkXH/99bnvvvuyatWq1FpTa83pp5++9p7uLfOhP1baAwAAAAAAAFu07q3m77vvvqxevbrf1fbdW96XUjJlypS1588666ysXr06Y8eOzWWXXdbn9vbdq/phfVlpDwAAAAAAAGzRnvSkJyVJVq1alV/96lf9zr3mmmuSJLvttts677P/zW9+kyT5+7//+37fR3/dddcNsltGGyvtAYBezZ49Ox0dHUNSa/r06Zk/f/6Q1AIAAAAA2FDPec5z8t73vjdJcsYZZ2Tvvffudd7VV1+dm266ae097brfY798+fI+n3PXXXflu9/97lC0zChipT0A0KuOjo4sWLAgy1euHtRnwYIFQxb+AwAAAABsjKc//elrg/rTTjstP/7xjx82Z/HixTn++OOTJGPGjMm//du/rXN9t912S5L84Q9/yFVXXfWw+1esWJEjjjgi999//1C3zxbOSnsAoE+TpkzLvHOvGFSNea+aM0TdAAAAAABsvNNOOy377rtvVq1alRe96EV54xvfmEMPPTTbbrtt5s+fn49//OP505/+lCR529vetnZL/W5HHnlkPve5z6WzszMHH3xw3v72t2fOnDnZeuutc/311+dTn/pU/vCHP2T//ffPlVde2cRPZDMltAcAAAAAAAC2eE996lNz4YUX5vDDD8+SJUtyyimn5JRTTnnYvNe//vX52Mc+9rDz++yzTz70oQ/lgx/8YP72t7+t3W6/3QknnJAnPelJQns2iO3xAQAAAAAAgFHhec97Xm655Za85z3vyVOf+tRMnjw5EyZMyN/93d/l1a9+dS6//PKceuqpGTOm9xj1Ax/4QL7//e/nec97XqZMmZLx48dn1qxZOeyww3LJJZfkE5/4xAj/IrYEpdbadA9sIUops5LcliS33XZbZs2a1XBHAAzGzJkzs3zl6iHZHn/bCeNy5513DlFnAAAAALB5+8Mf/pDVq1dn3Lhxa9+TDoy8Df1v8fbbb8/OO+/cfbhzrfX2oejDSnsAAAAAAAAAaIjQHgAAAAAAAAAaMmpD+1LKjqWUQ0opHy6lXFxKubeUUlufszay5nNKKW
eVUm4ppSwvpSwupdxcSjmvlPJvpZTtBrh/YinlHaWUa0spC1s1fldKOaWU8ugN6OPRrXt+16qxsFXz7aWUiRvz2wA
"text/plain": [
"<Figure size 2400x1200 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(12,6),dpi=200)\n",
"sns.histplot(data=df,x='age',hue='loan')"
]
},
{
"cell_type": "code",
"execution_count": 174,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='pdays', ylabel='Count'>"
]
},
"execution_count": 174,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAB/8AAAQICAYAAADr6xj7AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAB7CAAAewgFu0HU+AACBz0lEQVR4nOzdeby1VV03/s8XbhkVJ0xEUBNEcShNxAFNrR4t0UJzNgsVyZxN0ax8NNOcojQt59lSsQSV8ME0IacE+WVqigROoCIgKsgosH5/7HW8t9s9nMN9n3O4OO/363W99lp7reu71j73/d9nX2tXay0AAAAAAAAAwHBts94bAAAAAAAAAAC2jPAfAAAAAAAAAAZO+A8AAAAAAAAAAyf8BwAAAAAAAICBE/4DAAAAAAAAwMAJ/wEAAAAAAABg4IT/AAAAAAAAADBwwn8AAAAAAAAAGDjhPwAAAAAAAAAMnPAfAAAAAAAAAAZO+A8AAAAAAAAAAyf8BwAAAAAAAICBE/4DAAAAAAAAwMAJ/wEAAAAAAABg4IT/AAAAAAAAADBwwn8AAAAAAAAAGLhN670BmFRV2ye5Xe+eneTyddwOAAAAAAAAwNa0bZIb9PYXW2uXbI2iwn+uim6X5MT13gQAAAAAAADAKrtTks9tjUKO/QcAAAAAAACAgfPkP1dFZy81TjjhhNzoRjdaz70AAAAAAAAAbDXf/e53s//++y91z543dyWE/1wVXb7UuNGNbpQ99thjPfcCAAAAAAAAsFouXzxleRz7DwAAAAAAAAADJ/wHAAAAAAAAgIET/gMAAAAAAADAwAn/AQAAAAAAAGDghP8AAAAAAAAAMHDCfwAAAAAAAAAYOOE/AAAAAAAAAAyc8B8AAAAAAAAABk74DwAAAAAAAAADJ/wHAAAAAAAAgIET/gMAAAAAAADAwAn/AQAAAAAAAGDghP8AAAAAAAAAMHDCfwAAAAAAAAAYOOE/AAAAAAAAAAyc8B8AAAAAAAAABk74DwAAAAAAAAADJ/wHAAAAAAAAgIET/gMAAAAAAADAwAn/AQAAAAAAAGDghP8AAAAAAAAAMHDCfwAAAAAAAAAYOOE/AAAAAAAAAAyc8B8AAAAAAAAABk74P0VVvayq2th1r2Xc81tVdWRVnVFVl/TXI6vqt1aw7qaqekJVfaKqzq6qi6rqtKp6fVXdZgV1dq2qF1bVF6rqvH59ob93/RXUuW1f+7S+l7P73p5QVZuWWwcAAAAAAACA1VWttfXew1VKVd0+yYlJxsPte7fWjpsxf5skb0jyuDll35TkD1trV8xZd9ckxyS504wplyR5cmvtTXPWSVXdOclRSXabMeW7SQ5qrZ2woM7jk7wmyXYzppyQ5MDW2jnz6lwZVbVHktOT5PTTT88ee+yxtZcAAAAAAAAAWBdnnHFG9txzz6Xunq21M7ZGXU/+jxkL8jclOWuZt704m4P//0ryiCT799f/6u8fkuRFc9bdNsmR2Rz8vz/JbyW5c5Kn9r1sn+T1804SqKo9k3woo+D/siQvT/Kr/Xp5f+9GST7UA/ZZde6X5HUZBf/f63u4c9/T+/u0/ZMc2fcOAAAAAAAAwDry5P+Yqnp6kr9NcnJGYfxz+9DUJ/+rap8k/5PRlwU+l+RXW2sXjY3vlOT4JPtlFLzv21o7dUqdxyZ5c+/+Q2vtSRPjeyc5KckuSU7tdS6bUucdSR7duw9trb1vYvyhSd7bu29vrR08pcY1+ue/eZLzkvxKa+20iTl/n+SJvfuY1trbJutsCU/+AwAAAAAAAFdXnvxfZVV1kyR/2btPSHLpMm57ejb/PMBTxoP/JGmtXZjkKb27KckzZtR5Vn89N8lhk4P9CwMv6d29kzxwyv53S/Ko3j12MvjvdY5IcmzvPrrfM+mBGQX/SfKSyeC/OyzJD8baAAAAAAAAAKwj4f9mf5/kmhk9EX/8oslVVUl+p3dPbq3957R5/f2v9u7v9PvG6+
yTZN/ePaJ/YWCat421fy78T/Lb2fzv+dY5W1+qs02/Z9JBM9b8qb7HI3r31v0zAAAAAAAAALBOhP/56XH498/oyftnLZi+5BeT7N7bi74ssDR+4yQ3mxi7+5R5P6e1dmaSU3r3gClTllVnYmxena/2Na9sHQAAAAAAAADWyKbFU67equo6SV7Vu89prZ2zzFtvPdY+ecHc8fF9k3x9C+rsk2TPqtq5tXbBlDo/mhfat9a+W1XnJdklm08cSJJU1TWTLP24xEo/07JV1R4Lpkz7OQIAAAAAAAAAZtjw4X+Sl2cUNn8qyZtXcN94gH3Ggrmnj7X3nBi7MnWq3/fVsbGlOotqLNW5zVbaS6bUWc76AAAAAAAAAGwlG/rY/6q6R5JDklyW5AmttbaC26811v7xgrnjT+hfc5XrLKoxXme19gIAAAAAAADAGtqwT/5X1XZJ3pDRU/R/21r70gpL7DDWvnTB3EvG2juucp1FNcbrrNZeFll0UsBuSU5cYU0AAAAAAACADWvDhv9J/jTJrZJ8K8lfXIn7Lx5rb7dg7vZj7YsW1Lk4sy2qs9My9jJeZ9FellNjWp25Wmtzf1KgqlZSDgAAAAAAAGDD25DH/lfVrZI8t3ef0lq7YN78Gc4fay869n7nsfbkcfpbu85yjuBfqrNaewEAAAAAAABgDW3UJ/+fkdGT7V9LslNVPXzKnNuOtX+tqnbr7Q/1LwuMP72+x4L1xo+5P31ibLLOOcuo0ybuW6pzw2XsZbzO5F6+PbGX5dSYVgcAAAAAAACANbRRw/+lI+tvnuTdy5j/vLH2Lya5IMmXx9671YL7x8e/MjE2Wefzy6hz+pTTCr6c5I5Jrl1Vu7XWzpxWoKpulGSXaXtprZ1fVadnFOxvyWcCAAAAAAAAYA1tyGP/t5KvJ/lOb99zwdxf7a/fTvKNibFPjrVn1uknD+zTu5+aMmVZdSbG5tW55dhpB1emDgAAAAAAAABrZEM++d9aOzjJwfPmVNULkjy/d+/dWjtuokarqg8k+aMkt6qqu7TW/nNKnbtk81PyH2ittYk6p1TVV5Lsm+ShVfXM1tqFU7Y0vt8jp4x/MMlrM/pCx2OSvHfGR1uqc0W/Z9JRSR4xNvelkxOqaqckD+3dL7fWTpmxFgAAAAAAAFwp++23X848c+ph16yj3XbbLZ/73OfWextMsSHD/63olUkOTbJtkldX1a+21i5aGqyqHZO8uncv6/On+eskb05yvSQvT/Lk8cGq2ivJc3v31EwJ/1trZ1bVPyZ5dJL7VtWDW2v/PFHnIUnu27vvnPHTAEcm+VpGP4nw3Kp6X2vttIk5r0hy3bE2AAAAAAAAbFVnnnlmvv3tb6/3NmAwhP9boD+1/4okf5JkvySfqqqXJTktyV5JnpPkDn36K1pr/zuj1NuTPDbJAUme1I/bf2OSHyTZP8nzkuyS0dP6T22tXTajzp8l+c0kN0jy7qraL8nRfez+SZ7Z22cn+fMZn+knVfWUJB/qa36qql6U5ISMAv/HJ/ndPv2TSd45Yy8AAAAAAACwxaq2yQ7Xvv56b2PDu/hH309rV6z3NphD+L/l/izJL2QU3t8hyXumzHlzZoTtSdJau7yqDkpyTJI7ZRSu/+7EtEuSPLm19uE5dU6vqgdkdHT/bhl9+eA5E9POTHJQa+2MOXWOqaonJHlNkhtm8+kF405I8sDW2uWz6gAAAAAAAMCW2uHa189vv+wD672NDe+Dz/mdXPTDs9d7G8yxzXpvYOhaa1e01h6X5MAkH0jynSSX9tcPJLlfa+2QtuBrMK21c5LcLckTM3qi/vtJLs7oCP43Jrlja+1Ny9jPZ5PcLsmLknwpyY/79cX+3m37nEV13pjkjn3tr/W9fL/v7Y+SHND3DAAAAAAAAMA68+T/DK21FyR5wQrmH5PRk/tbsuZlSV7bry2pc05GPxXwvC2s86Ukh25JDQAAAAAAAABWnyf/AQAAAAAAAGDghP8AAAAAAAAAMH
DCfwAAAAAAAAAYOOE/AAAAAAAAAAyc8B8AAAAAAAAABk74DwAAAAAAAAADJ/wHAAAAAAAAgIET/gMAAAAAAADAwAn
"text/plain": [
"<Figure size 2400x1200 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(12,6),dpi=200)\n",
"sns.histplot(data=df,x='pdays')"
]
},
{
"cell_type": "code",
"execution_count": 175,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='pdays', ylabel='Count'>"
]
},
"execution_count": 175,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAB9sAAAQICAYAAACtXbtRAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAB7CAAAewgFu0HU+AABbwUlEQVR4nOzde5hld13n+883tCQhkCBhsIGA3MxwPcrYRDRcvDA4ASV4GRRnjkRgwOEmOGBw1IPX0YiOShQPGBCceYQHHUKiBwa8IkSRNAfn6CAEECRp0pAAQQjpxKR/549aTTZFVe36dtWunaZer+fZz1prr9/67V8lf757rVVjjAAAAAAAAAAAm3fcshcAAAAAAAAAAMcasR0AAAAAAAAAmsR2AAAAAAAAAGgS2wEAAAAAAACgSWwHAAAAAAAAgCaxHQAAAAAAAACaxHYAAAAAAAAAaBLbAQAAAAAAAKBJbAcAAAAAAACAJrEdAAAAAAAAAJrEdgAAAAAAAABoEtsBAAAAAAAAoElsBwAAAAAAAIAmsR0AAAAAAAAAmsR2AAAAAAAAAGgS2wEAAAAAAACgac+yF8AtQ1Udn+RB0+FVSW5a4nIAAAAAAAAAttOtkvyLaf9vxxjXb3VCsZ0jHpTk0mUvAgAAAAAAAGDBHpJk/1Yn8Rh5AAAAAAAAAGhyZztHXHVk553vfGfufOc7L3MtAAAAAAAAANvmyiuvzBlnnHHk8KqNxm6W2M4Rn39H+53vfOecdtppy1wLAAAAAAAAwKLcNH/IfB4jDwAAAAAAAABNYjsAAAAAAAAANIntAAAAAAAAANAktgMAAAAAAABAk9gOAAAAAAAAAE1iOwAAAAAAAAA0ie0AAAAAAAAA0CS2AwAAAAAAAECT2A4AAAAAAAAATWI7AAAAAAAAADSJ7QAAAAAAAADQJLYDAAAAAAAAQJPYDgAAAAAAAABNYjsAAAAAAAAANIntAAAAAAAAANAktgMAAAAAAABAk9gOAAAAAAAAAE1iOwAAAAAAAAA0ie0AAAAAAAAA0CS2AwAAAAAAAECT2A4AAAAAAAAATWI7AAAAAAAAADSJ7QAAAAAAAADQJLYDAAAAAAAAQJPYDgAAAAAAAABNYjsAAAAAAAAANIntAAAAAAAAANAktgMAAAAAAABAk9gOAAAAAAAAAE1iOwAAAAAAAAA0ie0AAAAAAAAA0CS2AwAAAAAAAECT2A4AAAAAAAAATWI7AAAAAAAAADSJ7QAAAAAAAADQJLYDAAAAAAAAQJPYDgAAAAAAAABNYjsAAAAAAAAANIntAAAAAAAAANAktgMAAAAAAABA055lLwCAW459+/bl4MGDy15Gy969e7N///5lLwMAAAAAANhlxHYAPu/gwYM5cODAspcBAAAAAABwiye2A/BFqo7LCaecuuxlbOjQpz+RMQ4vexkAAAAAAMAuJbYD8EVOOOXUPO68i5a9jA1dfO7Zue6aq5a9DAAAAAAAYJc6btkLAAAAAAAAAIBjjdgOAAAAAAAAAE1iOwAAAAAAAAA0ie0AAAAAAAAA0CS2AwAAAAAAAECT2A4AAAAAAAAATWI7AAAAAAAAADSJ7QAAAAAAAADQJLYDAAAAAAAAQJPYDgAAAAAAAABNYjsAAAAAAAAANIntAAAAAAAAANAktgMAAAAAAABAk9gOAAAAAAAAAE1iOwAAAAAAAAA0ie0AAAAAAAAA0CS2AwAAAAAAAECT2A4AAAAAAAAATWI7AAAAAAAAADSJ7QAAAAAAAADQJLYDAAAAAAAAQJPYDgAAAAAAAABNYjsAAAAAAAAANIntAAAAAAAAANAktgMAAAAAAABAk9gOAAAAAAAAAE1iOwAAAAAAAAA0ie0AAAAAAAAA0CS2AwAAAAAAAECT2A4AAAAAAAAATWI7AAAAAAAAADSJ7QAAAAAAAADQJLYDAAAAAAAAQJPYDgAAAAAAAABNYjsAAAAAAAAANIntAAAAAAAAAN
AktgMAAAAAAABAk9gOAAAAAAAAAE1iOwAAAAAAAAA0ie0AAAAAAAAA0CS2AwAAAAAAAECT2A4AAAAAAAAATWI7AAAAAAAAADSJ7QAAAAAAAADQJLYDAAAAAAAAQJPYDgAAAAAAAABNYjsAAAAAAAAANIntAAAAAAAAANAktgMAAAAAAABA055lLwAA4Gjt27cvBw8eXPYy2vbu3Zv9+/cvexkAAAAAAGyB2A4AHLMOHjyYAwcOLHsZAAAAAADsQmI7AHDMqzouJ5xy6rKXMdehT38iYxxe9jIAAAAAANgGYjsAcMw74ZRT87jzLlr2Mua6+Nyzc901Vy17GQAAAAAAbIPjlr0AAAAAAAAAADjWiO0AAAAAAAAA0CS2AwAAAAAAAECT2A4AAAAAAAAATWI7AAAAAAAAADSJ7QAAAAAAAADQJLYDAAAAAAAAQJPYDgAAAAAAAABNYjsAAAAAAAAANIntAAAAAAAAANAktgMAAAAAAABAk9i+hqo6r6rGzOcbN3HNWVV1YVVdUVXXT9sLq+qsxu/uqaofrKq3VdVVVXVdVX2wql5WVQ/Yyt8EAAAAAAAAwPbZs+wF3NJU1dck+eHG+OOSvDzJU1aduuv0eXxVXZDk6WOMwxvMc8ckb0zykFWn7pXkaUmeVFXPGmNcsNm1AQAAAAAAALAY7myfMRPO9yT5+CYv+7ncHNrfneSJSc6Ytu+evn9qkp/d4HdvleTC3BzaX5/krCRfl+Q501qOT/Kyzp3yAAAAAAAAACyG2P6FnpOV4P3eJK+YN7iqTk/y/Olwf5IzxxivHWNcOsZ4bZKHTd8nyQuq6j7rTPWkaWySvHSM8V1jjP85xnjnGOP8JGcm+aes/P96SVV5IgEAAAAAAADAEontk6q6e5KfmQ5/MMkNm7jsubn5UfzPHmNcN3tyjPG5JM+eDvcked468xwJ9p9M8oLVJ8cYH0jy89PhfZJ8xybWBgAAAAAAAMCCiO03+40kt03y6jHGW+cNrqpKcvZ0+N4xxjvWGjd9/77p8Ozputl5Tk9yv+nwdVOgX8urZvbFdgAAAAAAAIAlEtuTVNUTknxbVu4sf/6c4UfcM8ldpv15cf7I+bsmuceqcw9bY9wXGWMcTHLZdHjm5pYIAAAAAAAAwCLs+nd/V9Xtk/zadHjuGOPqTV56/5n9984ZO3v+fkk+tIV5Tk9yt6o6aYxx7dxVTqrqtDlD9m52LgAAAAAAAIDdbtfH9iS/mJXQfEmSVzSum43XV8wZe/nM/t22YZ6arnvfBmM3WgMAAAAAAAAAW7CrHyNfVQ9P8tQkNyb5wTHGaFx+u5n9z84ZO3sH+m0XNA8AAAAAAAAAO2TX3tleVbdO8vKs3CX+K2OMv2tOccLM/g1zxl4/s3/iguaZZ/Ud9avtTXJpc04AAAAAAACAXWnXxvYk/znJfZN8JMlPHcX1h2b2bz1n7PEz+9fNmedQ1rfRPBsaY2z4iPqq6kwHAAAAAAAAsKvtysfIV9V9k/zodPjsMca1G41fx2dm9uc90v2kmf3Vj4rfrnkAAAAAAAAA2CG79c7252XlLvJ/SHKbqvreNcY8cGb/m6tq77T/B1Ocn71T/LQ5vzf7CPfLV51bPc/Vm5hnrLoOAAAAAAAAgB20W2P7kcex3yvJazYx/idm9u+Z5Nok75n57r5zrp89//erzq2e5282Mc/lR3k3PgAAAAAAAADbYFc+Rn6bfCjJR6f9R84Z+4hpeyDJh1ede/vM/rrzTHfWnz4dXrK5JQIAAAAAAACwCLsyto8xzhlj1EafJD81c8k3zZz78DTHSHLRdP6+VfXQtX5r+v7IHekXTdfNruWy3Hy3+xOq6jbrLPucmf0LN/u3AgAAAAAAALD9dmVs30a/muSmaf/8qjpx9uR0fP50eOM0fi2/NG3vkOQXV5+sqnsn+dHp8AMR2wEAAAAAAACWSmzfgumu9BdPh/uSXFJV31NV+6rqe7LyuPd90/kXjzHev85Ur8
7Nj4Z/ZlX9flV9a1WdUVXPSvKXSU5OcjjJc8YYNy7kDwIAAAAAAABgU/YsewFfAn4syZ2SPDnJg5O8do0xr0jy4+t
"text/plain": [
"<Figure size 2400x1200 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(12,6),dpi=200)\n",
"sns.histplot(data=df[df['pdays']!=999],x='pdays')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Contact duration - contact with customer made, how long did call last?**"
]
},
{
"cell_type": "code",
"execution_count": 176,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(0.0, 2000.0)"
]
},
"execution_count": 176,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAACBAAAAQICAYAAAC++rrBAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAB7CAAAewgFu0HU+AAC16UlEQVR4nOzdebxVZb0/8M8DBAhqCoqYkDhLaUqiWJkNmmXadcistBJzqMzK5p/XShvurasNN7XBMkWzLDWHnK6opTmPWB4H1EQFBRXBWYFzWL8/zua4OZyJ4ZzD8H6/Xvu1n7XW83zXdx+jf/ZnP6tUVRUAAAAAAAAAYNXWp7cbAAAAAAAAAAB6nwABAAAAAAAAACBAAAAAAAAAAAAIEAAAAAAAAAAAESAAAAAAAAAAACJAAAAAAAAAAABEgAAAAAAAAAAAiAABAAAAAAAAABABAgAAAAAAAAAgAgQAAAAAAAAAQAQIAAAAAAAAAIAIEAAAAAAAAAAAESAAAAAAAAAAACJAAAAAAAAAAABEgAAAAAAAAAAAiAABAAAAAAAAAJCkX283wMqjlDIgyda1w6eTNPViOwAAAAAAAADLUt8k69bGd1dVNac3m+kOAgQsS1snua23mwAAAAAAAADoZtsnub23m1jWPMIAAAAAAAAAALADAcvU0wsGt956a9Zff/3e7AUAAAAAAABgmZk+fXp22GGHBYdPdzR3RSVAwLLUtGCw/vrrZ8SIEb3ZCwAAAAAAAEB3aep8yorHIwwAAAAAAAAAAAECAAAAAAAAAECAAAAAAAAAAACIAAEAAAAAAAAAEAECAAAAAAAAACACBAAAAAAAAABABAgAAAAAAAAAgAgQAAAAAAAAAAARIAAAAAAAAAAAIkAAAAAAAAAAACTp19sNAAAAAAAAwMpg/vz5efHFF/P8889n7ty5aWpq6u2WgDp9+/ZN//79s+aaa2b11VdPnz5+b9+aAAEAAAAAAAAspRdeeCGPP/54qqrq7VaAdjQ2NmbOnDl54YUXUkrJBhtskDXWWKO321quCBAAAAAAAADAUmgrPFBKSd++fXuxK6C1pqamln+nVVXl8ccfFyJoRYAAAAAAAAAAltD8+fMXCg+svvrqGTJkSAYNGpRSSi93B9Srqiovv/xyZs2alRdffLElRLD55pt7nEGNvwIAAAAAAAAsoQVfQibN4YERI0Zk8ODBwgOwHCqlZPDgwRkxYkRWX331JM2hghdffLGXO1t+CBAAAAAAAADAEnr++edbxkOGDBEcgBVAKSVDhgxpOa7/d7yqEyAAAAAAAACAJTR37twkzV9IDho0qJe7Abqq/jEjC/4dI0AAAAAAAAAAS6ypqSlJ0rdvX7sPwAqklJK+ffsmee3fMQIEAAAAAAAAAEAECAAAAAAAAACACBAAAAAAAAAAABEgAAAAAAAAAAAiQAAAAAAAAAAARIAAAAAAAAAAAIgAAQAAAAAAAAAQAQIAAAAAAACANo0fPz6llIwaNaq3W+nUcccdl1JKSim93QorMAECAAAAAAAAAECAAAAAAAAAAAAQIAAAAAAAAAAAIkAAAAAAAAAAdLMbbrghhx56aLbYYousueaa6d+/f0aMGJE999wzv/jFL/Lss8+2ue7iiy/OfvvtlxEjRmTAgAEZOnRo3va2t+VHP/pRXnzxxXbvN2HChJRSUkrJI488kvnz5+c3v/lN3v72t2fttdfO4MGD85a3vCX/9V//lZdffnmR9ccdd1xKKTnjjDOSJI8++mhLvfpXvblz5+biiy/OkUceme233z5rr712Xve612Xo0KEZN25cjjvuuMycObNLf685c+bkN7/5TfbYY49ssMEGGTBgQAYPHpw3v/nNOfTQQ3PFFVekqqqFPut3v/vdlvVt9frII4906d6s2vr1dgMAAAAAAADAyumVV17JIYcckrPPPnuRa48//ngef/zxXHrppXn66adz3HHHtVx79dVXc8ABB+SCCy
5YaM2sWbNy88035+abb85JJ52USy+9NNtuu22HPbz88svZbbfdcvXVVy90/u67787dd9+dv/71r/nb3/6WwYMHL/HnTJLDDz+8JXDQuudbb701t956a04++eRcdNFFecc73tFunbvuuiv77rtvpkyZstD5uXPn5t577829996b3/3ud5kyZUpGjRq1VD1DawIEAAAAAAAAwDI3f/787LXXXrnyyiuTJJtttlmOOOKIjB07NoMGDcr06dNz44035pxzzllk7UEHHdQSHthmm23y1a9+NaNHj86sWbPypz/9KRMmTMgTTzyRXXbZJf/617+ywQYbtNvHYYcdlptvvjkHHXRQ9t9//wwfPjyPPfZYjj/++Nx000259dZb84Mf/CA//OEPW9YcccQR2W+//fKtb30rF110Ud7whjfkiiuu6PDzNjY2ZuONN84+++yTHXbYIW984xvTr1+/PProo7nqqqty2mmn5Zlnnsk+++yThoaGDBs2bJEa9913X975zne27K6wzz775GMf+1g23njjNDU15YEHHsjEiRMXClbsvffeGTt2bH75y1/mV7/6VZLmcERrHf2NYIGyYGsLWFqllBFJpibJ1KlTM2LEiF7uCAAAAAAAoHs9+OCDaWxsTL9+/bLZZpv1djvLlRNPPDFf+tKXkjR/EX722WdnwIABi8ybP39+pk+f3vIF96WXXpo999wzSbLLLrvksssuS//+/Rda89vf/jaHH354kmT//ffPn//854WuT5gwIQcffHDL8e9///t84hOfWGjOnDlzMnbs2DQ0NGTo0KGZMWNG+vVb+PfX48ePzxlnnJENN9yw00cA/Pvf/87GG2+8yKMNFrj77rvz9re/PS+++GK+9a1v5fvf//4ic7bbbrvceeed6dOnT/7whz/kYx/7WJu1nnnmmQwaNCirrbZay7njjjuu5TEGvgPumsX99ztt2rSMHDlyweHIqqqmdWuDvaBPbzcAAAAAAAAArFzmz5+fE044IUkyYsSInHnmmW2GB5KkT58+C/06/he/+EWS5HWve11OP/30RcIDSfOuArvuumuS5Pzzz8/06dPb7WXfffddJDyQJAMGDMiRRx6ZpPkL+XvvvbeLn65tm2yySbvhgSTZeuutc+ihhyZJLrzwwkWuT5w4MXfeeWeS5Itf/GK74YEkGTp06ELhAVhWBAgAAAAAAACAZequu+7KtGnNP84+7LDDsvrqq3dpXWNjY6699tokyW677Vb/a+9FHHbYYS1rrrnmmnbnHXjgge1e22677VrGDz/8cJd67KrZs2fn3//+d+655540NDSkoaEha621VpLk3nvvzbx58xaaf8kll7SMjzrqqGXaC3RVv86nAAAAAAAAAHTdpEmTWsbvfOc7u7zu4Ycfzssvv5wkGTduXIdz6683NDS0O2/LLbds99qQIUNaxi+88EJX22zX3XffnZ/97Ge5/PLLM2PGjHbnzZ8/P7Nnz86wYcNazi34m73xjW/MhhtuuNS9wJIQIAAAAAAAAACWqZkzZ7aM119//S6vmzVrVsu4/sv1tgwfPrzNda0NGjSo3Wt9+ry2YXtTU1NXWmzX7373u3z2s59NY2Njl+a/8sorCx0v+Jstzt8LljWPMAAAAAAAAACWO6WU3m6hy+6///6W8MCwYcNywgkn5I477sgzzzyTuXPnpqqqVFWV3/3udy1rqqrqxY6hbXYgAAAAAAAAAJapddZZp2U8ffr0Dh8jUK/+kQJPPvlkh3PrHxFQv643TJgwIY2Njenbt2+uvfbadj9vRzslLPibTZ8+vVt6hK6wAwEAAAAAAACwTL31rW9tGf/jH//o8rqNN9645ZEDt9xyS4dzb7311pbxVltttZgddk1Xd0G45557kiTbbLNNh2GJ22+/vd1rC/5mjz32WB599NHF6LLZirRjA8svOxDAcmju3LmZNGnSYq0ZM2ZM+vfv300dAQAAAAAAdN0222yTkSNHZurUqTn11FPz1a
9+Nauvvnqn6/r165d3vetdufzyy3PllVdm2rRpGTFiRJtzTz311JY17373u5dl+y0GDhyYJJkzZ06H8xobG5MkL73
"text/plain": [
"<Figure size 2400x1200 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(12,6),dpi=200)\n",
"sns.histplot(data=df,x='duration',hue='contact')\n",
"plt.xlim(0,2000)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* 15 - previous: number of contacts performed before this campaign and for this client (numeric)\n",
"* 16 - poutcome: outcome of the previous marketing campaign (categorical: \"unknown\",\"other\",\"failure\",\"success\")"
]
},
{
"cell_type": "code",
"execution_count": 177,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='previous', ylabel='count'>"
]
},
"execution_count": 177,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAB/8AAAQICAYAAADr6xj7AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAB7CAAAewgFu0HU+AACcD0lEQVR4nOzde7ht53g3/u8dO4kcaGkQstOmkoak9K2KoFHh8r6UaIWqQxVBglJKSVUPqrmol9TvpfSApEIPDm1tEaK8bQnikPBqSyUiQbN3yMkpETlI8vz+mGPbs7NzzbnW3mvNlSf787mucY3nmc8z7nGvnfz3nWPMaq0FAAAAAAAAAOjXLuvdAAAAAAAAAACwY4T/AAAAAAAAANA54T8AAAAAAAAAdE74DwAAAAAAAACdE/4DAAAAAAAAQOeE/wAAAAAAAADQOeE/AAAAAAAAAHRO+A8AAAAAAAAAnRP+AwAAAAAAAEDnhP8AAAAAAAAA0DnhPwAAAAAAAAB0TvgPAAAAAAAAAJ0T/gMAAAAAAABA54T/AAAAAAAAANA54T8AAAAAAAAAdE74DwAAAAAAAACd27DeDcCkqto9yd2H6WVJbljHdgAAAAAAAABW0y2S3G4Yf661du1qFBX+c1N09yRnr3cTAAAAAAAAAGvsXkk+vRqFvPYfAAAAAAAAADrnyX9uii7bOjjrrLNyxzvecT17AQAAAAAAAFg1X//613P44YdvnV42a+9KCP+5Kbph6+COd7xjNm7cuJ69AAAAAAAAAKyVG+ZvWR6v/QcAAAAAAACAzgn/AQAAAAAAAKBzwn8AAAAAAAAA6JzwHwAAAAAAAAA6J/wHAAAAAAAAgM4J/wEAAAAAAACgc8J/AAAAAAAAAOic8B8AAAAAAAAAOif8BwAAAAAAAIDOCf8BAAAAAAAAoHPCfwAAAAAAAADonPAfAAAAAAAAADon/AcAAAAAAACAzgn/AQAAAAAAAKBzwn8AAAAAAAAA6JzwHwAAAAAAAAA6J/wHAAAAAAAAgM4J/wEAAAAAAACgc8J/AAAAAAAAAOic8B8AAAAAAAAAOif8BwAAAAAAAIDOCf8BAAAAAAAAoHPCfwAAAAAAAADonPAfAAAAAAAAADon/AcAAAAAAACAzgn/AQAAAAAAAKBzwn8AAAAAAAAA6JzwHwAAAAAAAAA6J/wHAAAAAAAAgM4J/wEAAAAAAACgc8J/AAAAAAAAAOic8B8AAAAAAAAAOif8BwAAAAAAAIDOCf8BAAAAAAAAoHPCfwAAAAAAAADonPAfAAAAAAAAADon/AcAAAAAAACAzgn/AQAAAAAAAKBzwn8AAAAAAAAA6JzwHwAAAAAAAAA6J/wHAAAAAAAAgM5tWO8GgJ3DhSfcfb1b2On86Es+t94tAAAAAAAAsCCe/AcAAAAAAACAzgn/AQAAAAAAAKBzwn8AAAAAAAAA6JzwHwAAAAAAAAA6J/wHAAAAAAAAgM4J/wEAAAAAAACgc8J/AAAAAAAAAOic8B8AAAAAAAAAOif8BwAAAAAAAIDOCf8BAAAAAAAAoHPCfwAAAAAAAADonPAfAAAAAAAAADon/AcAAAAAAACAzgn/AQAAAAAAAKBzwn8AAAAAAAAA6JzwHwAAAAAAAAA6J/wHAAAAAAAAgM4J/wEAAAAAAACgc8J/AAAAAAAAAOic8B8AAAAAAAAAOif8BwAAAAAAAIDOCf8BAAAAAAAAoHPCfwAAAAAAAADonPAfAAAAAAAAADon/AcAAAAAAACAzgn/AQAAAAAAAKBzwn8AAAAAAAAA6NxOG/5X1WFV9ZKq+mBVbamqa6vqu1V1XlW9uarut8J6D62qTWO1tgzzh66gxoaqemZVfbSqLquqq6vqgqp6Q1X95Arq7FNVJ1TVv1fVFcPx78NnP7KCOncb7n3B0MtlQ2/PrKoNy60DAAAAAAAAwNqq1tp697BwVfWRJD+3jK1vTXJca+26GbV2SfLGJE
+bUeekJM9ord04o84+SU5Pcq8ltlyb5NdbayfNariq7p3k3Un2XWLL15Mc3Vo7a06d45K8PsluS2w5K8lRrbXLZ9XZHlW1McnmJNm8eXM2bty42rdgHVx4wt3Xu4Wdzo++5HPr3QIAAAAAAAATtmzZkv3333/rdP/W2pbVqLuzPvl/p+H8tSSvTfLoJIcnuW+S30xy0bD+pCSnzKn18mwL/j+b5PFDrccP8yQ5NsnLlipQVbdIsinbgv93JXloknsneW6SS5PsnuQNs94kUFX7Jzkto+D/+iSvSnL/4XjV8Nkdk5w2BOxL1XlYkr/IKPi/ZOjh3kNP7xq2HZ5k09A7AAAAAAAAAOtoZ33y/70ZPdX/D621G6as75PkzCQHDx8d2Vr7yJR9Byf5jyQbknw6yf1ba1ePre+Z5Iwkh2UUvB/SWjt/Sp2nJjl5mP5Za+3ZE+sHJflMklsnOX+oc/2UOm9N8sRh+pjW2t9NrD8myTuG6Vtaa8dMqbFrknOT3DnJFUl+prV2wcSeP03yrGH6lNbaKZN1doQn/2+ePPm/eJ78BwAAAAAAuOnx5P8qaq09vLX2zmnB/7B+eZIXjH306CVKPS+j4D9JnjMe/A91vpfkOcN0Q5LnL1HnhcP5m0mOn9LP+UleMUwPSvLIyT1VtW+SJwzTD0wG/0Oddyb5wDB94nDNpEdmFPwnySsmg//B8Um+NTYGAAAAAAAAYB3tlOH/Mn1obHzg5GJVVZJHDNNzW2ufnFZk+PyLw/QRw3XjdQ5OcsgwfefwhYFpThkb/7fwP8kvZtt/zzcvUWO8zi7DNZOOXuKePzD0+M5heujwNwAAAAAAAACwToT/S9t9bDztDQE/nuROw/iMObW2ru+X5ICJtftN2ffftNYuTnLeMD1iypZl1ZlYm1Xni8M9t7cOAAAAAAAAAAsi/F/akWPjc6asHzo2PndOrfH1QybWtqfO/lW11xJ1vjMrtG+tfT3JFdN6qaq9k2z9cYkd+ZsAAAAAAAAAWKAN87fsfKpqlyS/PfbRO6ds2zg23jKn5Oax8f4Ta9tTp4brvji2trXOvBpb6/zkKvWSKXVmqqqNc7bsu5J6AAAAAAAAADs74f90z09y+DB+V2vtM1P23Gps/N059a4aG++9xnXm1Rivs1a9zLN5/hYAAAAAAAAAlstr/ydU1ZFJ/vcwvTTJry2x9ZZj4+vmlL12bLzHGteZV2O8zlr1AgAAAAAAAMACefJ/TFX9ZJJNGf27XJPkl1trly6x/Zqx8W5zSu8+Nr56Tp1rsrR5dfZcRi/jdeb1spwa0+rMM+9nAvZNcvYKawIAAAAAAADstIT/g6r68SQfTHKbJDckeVxr7SMzLrlybDzvtfd7jY0nX6c/WWdW+D+vzp7L6GW8zrxellNjWp2ZWmtbZq1X1UrKAQAAAAAAAOz0vPY/SVXdKck/JblTkpbkqa21U+dcNh5gb5yzd/xJ98nfu9+eOm3iuvE682qM15ns5aLt6GVaHQAAAAAAAAAWaKcP/6tqnyT/N8mdh4+e01p76zIu/cLY+K5z9o6vn7MKdTa31q5aos4PVdW+SxWoqjsmufW0XlprV2ZbkL8jfxMAAAAAAAAAC7RTh/9V9UNJPpDk0OGj326t/ekyL/9Kkq8N4yPn7L3/cL4oyVcn1j42Nl6yzhDoHzxMz5yyZVl1JtZm1bnLrC8RLKMOAAAAAAAAAAuy04b/VbVnkvcl+Znho5e31l653Otbay3J1p8GuGtV3WeJ+9wn256SP3W4brzOedn25Pxjhr6mOWZsvGnK+nuS3DiMnzKj9a11bhyumfTuJe75A0OPjxmmXxj+BgAAAAAAAADWyU4Z/lfVbhkF6EcMH722tfZ721HqNUluGMavq6o9Ju6zR5LXDdPrh/3T/PFwvm2SV03p98AkLx6m52dK+N9auzjJ3wzTh1TVo6fU+eUkDxmmfzVcM2lTki8P4xcP9550YpLbjI0BAAAAAAAAWEcb1r
uBdfK2JA8exv+S5OSqutuM/ddNe7q9tXZeVZ2Y5LeTHJbkzKp6ZZILkhyY5EVJ7jFsP7G19qUl6r8lyVMz+jLCs4f
"text/plain": [
"<Figure size 2400x1200 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(12,6),dpi=200)\n",
"sns.countplot(data=df,x='previous',hue='contact')"
]
},
{
"cell_type": "code",
"execution_count": 178,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='contact', ylabel='count'>"
]
},
"execution_count": 178,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAZEAAAEGCAYAAACkQqisAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAUNklEQVR4nO3df7DddX3n8edLfmz9RQmSUiSwYTRbja5GyALW7i6LLgSm26BLXbBKQKbprmDtjnVFZ7dQLK3d+mOkKltaI8FVKQWRrBvFDGJdqygXofyUIYsoySAEQsWWigO+94/zuXo23CSXz805l5v7fMycOd/z/n4+3+/nO3PDi++P8zmpKiRJ6vGM2R6AJGnuMkQkSd0MEUlSN0NEktTNEJEkddtztgcwbvvvv38tXrx4tochSXPKDTfc8GBVLdy2Pu9CZPHixUxMTMz2MCRpTkny3anqXs6SJHUzRCRJ3QwRSVI3Q0SS1M0QkSR1M0QkSd0MEUlSN0NEktTNEJEkdZt331iXdmffO++fz/YQ9DR0yO/dMrJteyYiSepmiEiSuhkikqRuhogkqZshIknqNrIQSXJwkmuT3J7ktiRva/Vzk2xOclN7nTDU511JNia5M8lxQ/UVrbYxydlD9UOTfKPV/zLJ3qM6HknSk43yTORx4O1VtRQ4CjgzydK27oNVtay91gO0dScDLwFWAB9NskeSPYCPAMcDS4FThrbzx21bLwQeBs4Y4fFIkrYxshCpqvuq6ltt+YfAHcBBO+iyEri0qh6rqu8AG4Ej2mtjVd1dVT8GLgVWJglwDHB5678WOHEkByNJmtJY7okkWQy8AvhGK52V5OYka5IsaLWDgHuHum1qte3Vnwf8XVU9vk19qv2vTjKRZGLLli274pAkSYwhRJI8B7gC+J2qegS4EHgBsAy4D3j/qMdQVRdV1fKqWr5w4ZN+Z16S1Gmk054k2YtBgHyyqj4DUFX3D63/c+Bz7eNm4OCh7otaje3UHwL2TbJnOxsZbi9JGoNRPp0V4GPAHVX1gaH6gUPNXgvc2pbXAScn+SdJDgWWAN8ErgeWtCex9mZw831dVRVwLXBS678KuGpUxyNJerJRnom8CngTcEuSm1rt3QyerloGFHAP8FsAVXVbksuA2xk82XVmVT0BkOQs4GpgD2BNVd3WtvdO4NIkfwDcyCC0JEljMrIQqaqvApli1fod9DkfOH+K+vqp+lXV3Qye3pIkzQK/sS5J6maISJK6GSKSpG6GiCSpmyEiSepmiEiSuhkikqRuhogkqZshIknqZohIkroZIpKkboaIJKmbISJJ6maISJK6GSKSpG6GiCSpmyEiSepmiEiSuhkikqRuhogkqZshIknqZohIkroZIpKkboaIJKmbISJJ6maISJK6GSKSpG6GiCSpmyEiSepmiEiSuo0sRJIcnOTaJLcnuS3J21p9vyQbktzV3he0epJckGRjkpuTHDa0rVWt/V1JVg3VD09yS+tzQZKM6ngkSU82yjORx4G3V9VS4CjgzCRLgbOBa6pqCXBN+wxwPLCkvVYDF8IgdIBzgCOBI4BzJoOntfnNoX4rRng8kqRtjCxEquq+qvpWW/4hcAdwELASWNuarQVObMsrgUtq4Dpg3yQHAscBG6pqa1U9DGwAVrR1+1TVdVVVwCVD25IkjcFY7okkWQy8AvgGcEBV3ddWfR84oC0fBNw71G1Tq+2ovmmK+lT7X51kIsnEli1bZnYwkqSfGnmIJHkOcAXwO1X1yPC6dgZRox5DVV1UVcuravnChQtHvTtJmjdGGiJJ9mIQIJ+sqs+08v3tUhTt/YFW3wwcPNR9UavtqL5oirokaUxG+XRWgI8Bd1TVB4ZWrQMmn7BaBVw1VD+1PaV1FPCDdtnrauDYJAvaDfVjgavbukeSHNX2derQtiRJY7DnCLf9KuBNwC1Jbmq1dwPvBS5LcgbwXeD1bd164ARgI/AocDpAVW1N8h7g+t
buvKra2pbfAlwMPBP4fHtJksZkZCFSVV8Ftve9jVdP0b6AM7ezrTXAminqE8BLZzBMSdIM+I11SVI3Q0SS1M0QkSR1M0QkSd0MEUlSN0NEktTNEJEkdTNEJEndDBFJUjdDRJLUzRCRJHUzRCRJ3QwRSVI3Q0SS1M0QkSR1M0QkSd0MEUlSN0NEktTNEJEkdTNEJEndDBFJUjdDRJLUzRCRJHUzRCRJ3QwRSVI3Q0SS1M0QkSR1M0QkSd0MEUlSN0NEktTNEJEkdRtZiCRZk+SBJLcO1c5NsjnJTe11wtC6dyXZmOTOJMcN1Ve02sYkZw/VD03yjVb/yyR7j+pYJElTG+WZyMXAiinqH6yqZe21HiDJUuBk4CWtz0eT7JFkD+AjwPHAUuCU1hbgj9u2Xgg8DJwxwmORJE1hWiGS5Jrp1IZV1VeArdMcx0rg0qp6rKq+A2wEjmivjVV1d1X9GLgUWJkkwDHA5a3/WuDEae5LkrSL7LmjlUl+DngWsH+SBUDaqn2Agzr3eVaSU4EJ4O1V9XDb1nVDbTYNbf/ebepHAs8D/q6qHp+i/VTHsRpYDXDIIYd0Dnvg8HdcMqP+2j3d8CenzvYQpFmxszOR3wJuAF7U3idfVwEf7tjfhcALgGXAfcD7O7bxlFXVRVW1vKqWL1y4cBy7lKR5YYdnIlX1IeBDSd5aVX86051V1f2Ty0n+HPhc+7gZOHio6aJWYzv1h4B9k+zZzkaG20uSxmSHITKpqv40yS8Di4f7VNVTuraT5MCquq99fC0w+eTWOuBTST4APB9YAnyTweWzJUkOZRASJwNvqKpKci1wEoP7JKsYnB1JksZoWiGS5BMMLkPdBDzRygVsN0SSfBo4msH9lE3AOcDRSZa1vvcwuFxGVd2W5DLgduBx4MyqeqJt5yzgamAPYE1V3dZ28U7g0iR/ANwIfGw6xyJJ2nWmFSLAcmBpVdV0N1xVp0xR3u5/6KvqfOD8KerrgfVT1O9m8PSWJGmWTPd7IrcCvzjKgUiS5p7pnonsD9ye5JvAY5PFqvq1kYxKkjQnTDdEzh3lICRJc9N0n87661EPRJI090z36awfMniiCmBvYC/gH6pqn1ENTJL09DfdM5HnTi63eatWAkeNalCSpLnhKc/iWwOfBY7bWVtJ0u5tupezXjf08RkMvjfyo5GMSJI0Z0z36ax/N7T8OINvm6/c5aORJM0p070ncvqoByJJmnum+6NUi5Jc2X7u9oEkVyRZNOrBSZKe3qZ7Y/3jDGbafX57/a9WkyTNY9MNkYVV9fGqery9Lgb8dSdJmuemGyIPJXljkj3a640MfhhKkjSPTTdE3gy8Hvg+g5+1PQk4bURjkiTNEdN9xPc8YFVVPQyQZD/gfQzCRZI0T033TORlkwECUFVbgVeMZkiSpLliuiHyjCQLJj+0M5HpnsVIknZT0w2C9wNfT/JX7fOvM8VP2UqS5pfpfmP9kiQTwDGt9Lqqun10w5IkzQXTviTVQsPgkCT91FOeCl6SpEmGiCSpmyEiSepmiEiSuhkikqRuhogkqZshIknqZohIkroZIpKkbiMLkSRr2u+x3zpU2y/JhiR3tfcFrZ4kFyTZmOTmJIcN9VnV2t+VZNVQ/fAkt7Q+FyTJqI5FkjS1UZ6JXAys2KZ2NnBNVS0BrmmfAY4HlrTXauBC+OlswecARwJHAOcMzSZ8IfCbQ/223ZckacRGFiJV9RVg6zbllcDatrwWOHGofkkNXAfsm+RA4DhgQ1Vtbb9nsgFY0dbtU1XXVVUBlwxtS5I0JuO+J3JAVd3Xlr8PHNCWDwLuHWq3qdV2VN80RV2SNEazdmO9nUHUOPaVZHWSiSQTW7ZsGccuJWleGHeI3N8uRdHeH2j1zcDBQ+0WtdqO6oumqE+pqi6qquVVtXzhwoUzPghJ0sC4Q2QdMPmE1SrgqqH6qe0praOAH7TLXlcDxyZZ0G6oHwtc3dY9kuSo9lTWqUPbkiSNych+Jz3Jp4
Gjgf2TbGLwlNV7gcuSnAF8F3h9a74eOAHYCDwKnA5QVVuTvAe4vrU7r6omb9a/hcETYM8EPt9ekqQxGlmIVNUp21n
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.countplot(data=df,x='contact')"
]
},
{
"cell_type": "code",
"execution_count": 179,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4234"
]
},
"execution_count": 179,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# df['previous'].value_counts()\n",
"df['previous'].value_counts().sum()-36954\n",
"# 36954 vs. 8257"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Categorical Features"
]
},
{
"cell_type": "code",
"execution_count": 180,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>age</th>\n",
" <th>job</th>\n",
" <th>marital</th>\n",
" <th>education</th>\n",
" <th>default</th>\n",
" <th>housing</th>\n",
" <th>loan</th>\n",
" <th>contact</th>\n",
" <th>month</th>\n",
" <th>day_of_week</th>\n",
" <th>...</th>\n",
" <th>campaign</th>\n",
" <th>pdays</th>\n",
" <th>previous</th>\n",
" <th>poutcome</th>\n",
" <th>emp.var.rate</th>\n",
" <th>cons.price.idx</th>\n",
" <th>cons.conf.idx</th>\n",
" <th>euribor3m</th>\n",
" <th>nr.employed</th>\n",
" <th>subscribed</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>56</td>\n",
" <td>housemaid</td>\n",
" <td>married</td>\n",
" <td>basic.4y</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>telephone</td>\n",
" <td>may</td>\n",
" <td>mon</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>nonexistent</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>no</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>57</td>\n",
" <td>services</td>\n",
" <td>married</td>\n",
" <td>high.school</td>\n",
" <td>unknown</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>telephone</td>\n",
" <td>may</td>\n",
" <td>mon</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>nonexistent</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>no</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>37</td>\n",
" <td>services</td>\n",
" <td>married</td>\n",
" <td>high.school</td>\n",
" <td>no</td>\n",
" <td>yes</td>\n",
" <td>no</td>\n",
" <td>telephone</td>\n",
" <td>may</td>\n",
" <td>mon</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>nonexistent</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>no</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>40</td>\n",
" <td>admin.</td>\n",
" <td>married</td>\n",
" <td>basic.6y</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>telephone</td>\n",
" <td>may</td>\n",
" <td>mon</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>nonexistent</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>no</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>56</td>\n",
" <td>services</td>\n",
" <td>married</td>\n",
" <td>high.school</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>yes</td>\n",
" <td>telephone</td>\n",
" <td>may</td>\n",
" <td>mon</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>nonexistent</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>no</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 21 columns</p>\n",
"</div>"
],
"text/plain": [
" age job marital education default housing loan contact \\\n",
"0 56 housemaid married basic.4y no no no telephone \n",
"1 57 services married high.school unknown no no telephone \n",
"2 37 services married high.school no yes no telephone \n",
"3 40 admin. married basic.6y no no no telephone \n",
"4 56 services married high.school no no yes telephone \n",
"\n",
" month day_of_week ... campaign pdays previous poutcome emp.var.rate \\\n",
"0 may mon ... 1 999 0 nonexistent 1.1 \n",
"1 may mon ... 1 999 0 nonexistent 1.1 \n",
"2 may mon ... 1 999 0 nonexistent 1.1 \n",
"3 may mon ... 1 999 0 nonexistent 1.1 \n",
"4 may mon ... 1 999 0 nonexistent 1.1 \n",
"\n",
" cons.price.idx cons.conf.idx euribor3m nr.employed subscribed \n",
"0 93.994 -36.4 4.857 5191.0 no \n",
"1 93.994 -36.4 4.857 5191.0 no \n",
"2 93.994 -36.4 4.857 5191.0 no \n",
"3 93.994 -36.4 4.857 5191.0 no \n",
"4 93.994 -36.4 4.857 5191.0 no \n",
"\n",
"[5 rows x 21 columns]"
]
},
"execution_count": 180,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 181,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAB/8AAASvCAYAAADv8jmtAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAB7CAAAewgFu0HU+AADOd0lEQVR4nOzde7StVX0f/O8Pj6BgsDGoBz309RZeUBmNrwiJEkzatARNBBsvMYmVCDFGxWqiw1hja6wOFW3fVNukRrzFJiZoRUSxmN68YBWkNiavAbwOQSGCpkGRSw/M9489t+fJYu299rnstZmcz2eMZ6w5nznn75n7wH/fNZ9VrbUAAAAAAAAAAOM6YKs3AAAAAAAAAADsHeE/AAAAAAAAAAxO+A8AAAAAAAAAgxP+AwAAAAAAAMDghP8AAAAAAAAAMDjhPwAAAAAAAAAMTvgPAAAAAAAAAIMT/gMAAAAAAADA4IT/AAAAAAAAADA44T8AAAAAAAAADE74DwAAAAAAAACDE/4DAAAAAAAAwOCE/wAAAAAAAAAwOOE/AAAAAAAAAAxO+A8AAAAAAAAAgxP+AwAAAAAAAMDghP8AAAAAAAAAMLhtW70BmFVVByU5pnevTXLrFm4HAAAAAAAAYF+6S5J79/aft9Zu3hdFhf/cER2T5JKt3gQAAAAAAADAJntUks/si0Je+w8AAAAAAAAAg3Pynzuia1cbF198cQ4//PCt3AsAAAAAAADAPnP11VfnuOOOW+1eu97c3SH8547o1tXG4Ycfnh07dmzlXgAAAAAAAAA2y62Lp2yM1/4DAAAAAAAAwOCE/wAAAAAAAAAwOOE/AAAAAAAAAAxO+A8AAAAAAAAAgxP+AwAAAAAAAMDghP8AAAAAAAAAMDjhPwAAAAAAAAAMTvgPAAAAAAAAAIMT/gMAAAAAAADA4IT/AAAAAAAAADA44T8AAAAAAAAADE74DwAAAAAAAACDE/4DAAAAAAAAwOCE/wAAAAAAAAAwOOE/AAAAAAAAAAxO+A8AAAAAAAAAgxP+AwAAAAAAAMDghP8AAAAAAAAAMDjhPwAAAAAAAAAMTvgPAAAAAAAAAIMT/gMAAAAAAADA4IT/AAAAAAAAADA44T8AAAAAAAAADE74DwAAAAAAAACDE/4DAAAAAAAAwOCE/wAAAAAAAAAwOOE/AAAAAAAAAAxO+A8AAAAAAAAAgxP+AwAAAAAAAMDghP8AAAAAAAAAMDjhPwAAAAAAAAAMTvgPAAAAAAAAAIMT/gMAAAAAAADA4IT/AAAAAAAAADA44T8AAAAAAAAADE74DwAAAAAAAACDE/4DAAAAAAAAwOCE/wAAAAAAAAAwOOE/AAAAAAAAAAxO+A8AAAAAAAAAg9u21RuAzXLt7/2Hrd4CS3TvX/ulrd4CAAAAAAAAbBkn/wEAAAAAAABgcMJ/AAAAAAAAABic8B8AAAAAAAAABif8BwAAAAAAAIDBCf8BAAAAAAAAYHDCfwAAAAAAAAAYnPAfAAAAAAAAAAYn/AcAAAAAAACAwQn/AQAAAAAAAGBwwn8AAAAAAAAAGJzwHwAAAAAAAAAGJ/wHAAAAAAAAgMEJ/wEAAAAAAABgcMJ/AAAAAAAAABic8B8AAAAAAAAABif8BwAAAAAAAIDBCf8BAAAAAAAAYHDCfwAAAAAAAAAYnPAfAAAAAAAAAAYn/AcAAAAAAACAwQn/AQAAAAAAAGBwwn8AAAAAAAAAGJzwHwAAAAAAAAAGJ/wHAAAAAAAAgMHtt+F/Vd2nqn6mql5ZVR+uquuqqvXrHXtQ7+SqOreqrqqqm/vnuVV18m7U2FZVz66qj1fVtVV1Y1V9qareXFUP2406h/W/63NVdX2/Ptfv/dBu1Hl4f/aX+l6u7Xt7dlVt22gdAAAAAAAAADbX/hzg/tW+KFJVByT5/SSnzwzdv1+nVtXZSX61tXbbOnUOS3JBkkfNDD0oybOSPKOqntdaO3vBfo5P8v
4k22eGjunXGVV1amvt4gV1fiXJv01y4OT23ZKc0K9frqrHt9auW68OAAAAAAAAAJtvvz35P+NrST6yh2tfnV3B/2eTPC3Jcf3zs/3+GUletVaBqrpLknOzK/h/X5KTkxyf5PlJvpnkoCRvXu9NAlV1RJLzsxL870xyVpIT+3VWv3d4kvOrasc6dR6X5N9nJfj/q76H4/ue3tenHZfk3L53AAAAAAAAALbQ/nzy/5VJLklySWvtr6rqAUm+sjsFqurIJC/q3c8kObG1dmPvX1JVH0jy0STHJnlxVb2ttfbFOaWekZXT9Enyu621507GLq6qDye5NMmhSd5YVUe31nbOqfPqJPfu7V9orb1nMvbxqro0yZ8kuU9Wvoxw2py/6a5J3pSVL4Zcn+QxrbUvTab8p6r6d0me0/f89CTvmLMXAAAAAAAAAJZkvz3531r7F621D7bW9ub1/y/Iri9QnDkJ/lef8b0kZ/butiQvXKPO6hcIvp3kxXP2+sUkr+ndhyR54uycqtqe5Bd798KZ4H+1zjlJLuzdp/c1s56YlZ8aSJLXzAT/q16c5K8nbQAAAAAAAAC20H4b/u+tqqokp/TuZa21T82b1+9f3run9HXTOkcmObp3z+lfGJjnHZP27cL/JE/Irv+eb19n66t1DuhrZp26xjO/r+/xnN59aP8bAAAAAAAAANgiwv8998Ak9+vtjy6Yuzp+/yQPmBk7Yc6822mtXZPkit59zJwpG6ozM7Zencv7M/e0DgAAAAAAAABLIvzfcw+dtC9bMHc6fvTM2J7UOaKqDlmjzt+sF9q31q5Ocv28vVTVPZIcsZt7uV0dAAAAAAAAAJZr2+IprGHHpH3VgrlXTtpHzIztSZ3q6y6fjK3WWVRjtc7D9tFeMqfOuqpqx4Ip23enHgAAAAAAAMD+Tvi/535g0v7ugrk3TNr32OQ6i2pM62zWXha5cvEUAAAAAAAAADbKa//33N0m7VsWzL150r77JtdZVGNaZ7P2AgAAAAAAAMASOfm/526atA9cMPegSfvGBXVuytoW1Tl4A3uZ1lm0l43UmFdnkUU/E7A9ySW7WRMAAAAAAABgvyX833PfmbQXvfb+kEl79nX6s3XWC/8X1Tl4A3uZ1lm0l43UmFdnXa21q9Ybr6rdKQcAAAAAAACw3/Pa/z03DbB3LJg7Pek++3v3e1Knzayb1llUY1pndi9f34O9zKsDAAAAAAAAwBIJ//fc5yftoxbMnY7/5T6oc2Vr7YY16tyzqravVaCqDk9y6Ly9tNa+k11B/t78TQAAAAAAAAAskfB/z30lyTd6+7EL5p7YP7+e5KszY5+YtNes0wP9I3v3ojlTNlRnZmy9Ov/3el8i2EAdAAAAAAAAAJZE+L+HWmstyXm9e1RV/ei8ef3+6in58/q6aZ0rsuvk/FOq6uA1HnnapH3unPEPJLmtt395na2v1rmtr5n1/jWe+X19j0/p3c/3vwEAAAAAAACALSL83zu/k+TW3n5TVd19Otj7b+rdnX3+PG/on/dKctbsYFU9OMlLe/eLmRP+t9auSfKHvXtSVT1pTp0nJzmpd9/V18w6N8mXe/ul/dmzXp/kBydtAAAAAAAAALbQtq3ewFapqhOSPGRy67BJ+yFVddp0fmvtHbM1WmtXVNXrk/xmkmOTXFRVr0vypSQPTvKSJI/o01/fWvvCGtt5Z5JnJnlMkuf21+2/JclfJzkuycuTHJqV0/rPb63tXKPOy5L8dJJ7J3l3VR2b5IN97GeS/EZvX5vkt+YVaK39n6o6M8n5/ZkXVdWrklyclcD/V5L8XJ/+iSTvWmMvAAAAAAAAACxJzbyFfr9RVe9I8oyNzm+t1Rp1DshKUP/MdZa/NcmzWmu3rTWhqg5LckGSR60x5eYkz2utnb3ePqvq+Ky8un/7GlOuSXJqa+3TC+r8SpJ/m+TANaZcnOTxrbXr1quzJ6pqR5Irk+TKK6/Mjh079qjOtb/3H/bltriDu/ev/d
JWbwEAAAAAAAAWuuqqq3LEEUesdo9orV21L+p67f9eaq3d1lo7Pcnjk5yX5BtJbumf5yV5XGvtjPWC/17nuiSPTvK
"text/plain": [
"<Figure size 2400x1200 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(12,6),dpi=200)\n",
"# https://stackoverflow.com/questions/46623583/seaborn-countplot-order-categories-by-count\n",
"sns.countplot(data=df,x='job',order=df['job'].value_counts().index)\n",
"plt.xticks(rotation=90);"
]
},
{
"cell_type": "code",
"execution_count": 182,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAB/8AAAT6CAYAAACzsaDpAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAB7CAAAewgFu0HU+AAC/r0lEQVR4nOzdfbju13wn/vcnjkQSoiU44WSqaEY8tDUilDRqdGikiD5QNa0gVeNpKBlj0GIYJDqj0mkH8dhplXZEhPjR6VRKqETGFENEPIwThEQQIg9Nsn5/7LWd2+3e+977nLPvfVbO63Vd3+te67vW+nzXPvKH63rf63tXay0AAAAAAAAAwLj22ewNAAAAAAAAAAC7RvgPAAAAAAAAAIMT/gMAAAAAAADA4IT/AAAAAAAAADA44T8AAAAAAAAADE74DwAAAAAAAACDE/4DAAAAAAAAwOCE/wAAAAAAAAAwOOE/AAAAAAAAAAxO+A8AAAAAAAAAgxP+AwAAAAAAAMDghP8AAAAAAAAAMDjhPwAAAAAAAAAMTvgPAAAAAAAAAIMT/gMAAAAAAADA4IT/AAAAAAAAADA44T8AAAAAAAAADG7LZm8AplXVfknu3ruXJLluE7cDAAAAAAAAsDvdKMmtevuTrbWrd0dR4T97orsnOXezNwEAAAAAAACwwe6V5GO7o5DX/gMAAAAAAADA4Jz8Z090yXLjnHPOySGHHLKZewEAAAAAAADYbb72ta/lyCOPXO5estrc9RD+sye6brlxyCGHZNu2bZu5FwAAAAAAAICNct38KWvjtf8AAAAAAAAAMDjhPwAAAAAAAAAMTvgPAAAAAAAAAIMT/gMAAAAAAADA4IT/AAAAAAAAADA44T8AAAAAAAAADE74DwAAAAAAAACDE/4DAAAAAAAAwOCE/wAAAAAAAAAwOOE/AAAAAAAAAAxO+A8AAAAAAAAAgxP+AwAAAAAAAMDghP8AAAAAAAAAMDjhPwAAAAAAAAAMTvgPAAAAAAAAAIMT/gMAAAAAAADA4IT/AAAAAAAAADA44T8AAAAAAAAADE74DwAAAAAAAACDE/4DAAAAAAAAwOCE/wAAAAAAAAAwOOE/AAAAAAAAAAxO+A8AAAAAAAAAgxP+AwAAAAAAAMDghP8AAAAAAAAAMDjhPwAAAAAAAAAMTvgPAAAAAAAAAIMT/gMAAAAAAADA4IT/AAAAAAAAADA44T8AAAAAAAAADE74DwAAAAAAAACDE/4DAAAAAAAAwOCE/wAAAAAAAAAwOOE/AAAAAAAAAAxO+A8AAAAAAAAAgxP+AwAAAAAAAMDghP8AAAAAAAAAMDjhPwAAAAAAAAAMTvgPAAAAAAAAAIMT/gMAAAAAAADA4LZs9gZgM9zzxLds9hZgw5138m9v9hYAAAAAAABYECf/AQAAAAAAAGBwe234X1W3rqpfrqoXV9V7q+rSqmr9etMaaxxQVb9SVX9aVedW1beq6p+q6ptV9ZGqemFVbV3Hng6oqn/Xa11WVVdU1flV9YdV9RPrqPMTfc35vcZlveaJVXXAOurct6r+e1X9v6q6qqourqr3VdWj11oDAAAAAAAAgI23N7/2/+u7sriqfjrJ2UluOmP4Fknu069nVtUTW2tvm1PvTknOTPJTU0P/vF8nVNVjWmvvnlPnoUn+e5KDJm4fkOSIfp1QVce21i6cU+eFSV6QH/6CyG2SPCjJg6rqMUl+rbV21Wp1AAAAAAAAANh4e+3J/ylfTvL+da45KDuC/7OTPDfJv0ryL5I8OMlrklzf5/15VR2zUqGqulmS92RH8P+6JA9Mct8kz0vyvV7nbVX1s6vUuUeSt/W53+tr79trva5POyzJe/ozV6rzu0n+IEv/fXw+yROSHJnkuCR/16cdm+QNK9UAAAAAAAAAYHH25pP/L05ybpJzW2tfr6rbJ/niOtZfn+TtSV7UWvv0jPH3V9V7k5yW5EZJTqmqn2qttRlzT8xSKJ
8k/661dvLE2Eeq6gNJzsrSCf5XJfmFFfb0R0n2T3Jtkge11j4yMfa/qupzSU7qz3pWkhdOF6iqWyR5Re9+Ocl9WmuXToy/u/9ND03y6Kp6bWvtAyvsBwAAAAAAAIAF2GtP/rfW/qC19u7W2k69/r+19uHW2qNWCP6X55ye5B29e8ck95ieU1U3TvL03v1Mkj+c9awkr+/d+1fVvWbUOTLJz/fu66eC/2V/2J+RJP+2P3vaCUlu3tvPmQz++16uS/LkJNf1WyfOqAEAAAAAAADAAu214f8C/d1E+44zxh+QHWH7m1tr169Q500T7UfMGD9uov3GWQV67bf07o/1Z69U5/Ls+OLCdJ2LkvzP3n3gaj8hAAAAAAAAAMDGE/5vvP0m2tfNGD9qon3WKnU+luT7vX2/VepckeS8VepMPuOH6lTVvkmO7N2PtNauWUOd/ZIcsco8AAAAAAAAADaY8H/j3X+i/ZkZ43eZaJ+/UpHW2rVJLuzdw2dMWb53YZ+7kslnTNc5LMmN5u1lDXUAAAAAAAAAWKAtm72BG7Kq+pkkx/buJ1trs8L/bf3zitbat+eU3J7kp5Pcqqr2a61d3Z9zkyQH9zkXrVagtfatqroiyYFJDl1hL3Pr9L0sm66zqqraNmfK1vXUAwAAAAAAANjbCf83SFXtl+TU7DhJ/7wVpt6sf35vDWWvmGjfNMnVUzXWU+fAXmPWXtZSZ3ov67F9/hQAAAAAAAAA1spr/zfOHyc5orff3Fo7Y4V5N+mf16yh5tUT7f1n1Fhvnf2n7q+nzkp7AQAAAAAAAGDBnPzfAFX13CQn9O65SZ6yyvSr+ue+ayi930T7yhk11lvnyqn766mz0l7WYt7PBGzN0r8bAAAAAAAAAGsg/N/Nqup3k/yn3j0/yUNaa1essuS7/XMtr84/cKI9+Vr+706011Nn+tX+66mz0l7maq1dtNp4Va2nHAAAAAAAAMBez2v/d6OqenSSP+nd/5fkX7XWLp2zbDkIP7CqfmzO3OUT85e01n7w2v3W2lVJvtm72+bs8cezI7jfvsJe5tbJD5/en64DAAAAAAAAwAIJ/3eTqnpYkrdk6d/0a0keOO+Ee/fpifadV6m/Jckde/czq9S5U5+7kslnTNe5IMl18/ayhjoAAAAAAAAALJDwfzeoqgcmeXuWfkbhm1k68f/5NS7/0ET7/qvMOyI7TuyfvUqdA5Pcc5U6k8/4oTqttWuSnNO7P1dV+66hztVJPrbKPAAAAAAAAAA2mPB/F1XVfZOcnmS/JN9J8uDW2v9dR4kP9HVJ8tha+Qfvj59onzZj/J0T7cetsNd9kvx27347yd+tUuegJL+yQp1tSX6xd/+2tfbdWfMAAAAAAAAAWAzh/y6oqp9N8p4snba/IsmxrbXz1lOjn7Z/de8enuTZM57zc0me0LtntdbOnVHnnCQf7N0n9DXTntWfkSR/1Fr7pxlzTs2OLyO8vKpuObWXGyX5kyQ36rdOnvV3AQAAAAAAALA4q/02/A1aVR2V5E4Ttw6eaN+pqo6fnN9ae9PU+jsmeV+SH+u3np/kO1V1t1Ue+43W2jdm3D85yaOSHJbkpKq6U5K/THJlkgck+Q9Z+t/qyiTPWKX+v83Sq/z3T/L+qvpPWTrdv3+S30jyxD7vgiR/OKtAa+2yqnpOkv+W5CeSfLSqXprkk0lu25//gD79ra21D6yyHwAAAAAAAAAWYK8N/5OckOSxK4zdr1+T3jTV//kkt57o/5c1PPNFSV44fbO19t2qOjbJmUl+Kksh/ROnpl2e5DGttf+zUvHW2ser6lFJ/nuWXtv/n2ZMuyBLbyhY8VX9rbXXVNVtk7wgyR2TvGHGtDOTPH6lGgAAAAAAAAAsjtf+7yFaaxcmuUeS5yT5WJJvJ/l+ks9m6YsFP91ae/ca6pyR5Kf7mgt6jW/3ms9Jco/+rHl1/iDJUUn+Isn2JNck+UaSv0nym621Y1trV63rjwQAAAAAAABgQ1Rrbb
P3AD+kqrZl6QsH2b59e7Zt27bbn3HPE9+y22vCnua8k397s7cAAAAAAADAlIsuuiiHHnrocvfQ1tpFu6Ouk/8AAAA
"text/plain": [
"<Figure size 2400x1200 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(12,6),dpi=200)\n",
"# https://stackoverflow.com/questions/46623583/seaborn-countplot-order-categories-by-count\n",
"sns.countplot(data=df,x='education',order=df['education'].value_counts().index)\n",
"plt.xticks(rotation=90);"
]
},
{
"cell_type": "code",
"execution_count": 183,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAB/8AAAT6CAYAAACzsaDpAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAB7CAAAewgFu0HU+AADaJ0lEQVR4nOzdaZhdVZk24GclISEBEg0hBEgUVGZQwwwi0A0KyDwoLTQQZr9GnFFpUWlRtImt3c7IEEAERBAQmaIo8xxQJgVBVKYEMMiUiST7+1GnKieVGjJUqtjkvq/rXGftvdZ+93tK8sfnrH1KVVUBAAAAAAAAAOqrX183AAAAAAAAAAAsGeE/AAAAAAAAANSc8B8AAAAAAAAAak74DwAAAAAAAAA1J/wHAAAAAAAAgJoT/gMAAAAAAABAzQn/AQAAAAAAAKDmhP8AAAAAAAAAUHPCfwAAAAAAAACoOeE/AAAAAAAAANSc8B8AAAAAAAAAak74DwAAAAAAAAA1J/wHAAAAAAAAgJoT/gMAAAAAAABAzQn/AQAAAAAAAKDmhP8AAAAAAAAAUHPCfwAAAAAAAACouQF93QC0V0oZlGTjxuFzSeb0YTsAAAAAAAAAPal/klUa4/urqprZE0WF/7webZzkrr5uAgAAAAAAAGAp2zzJ3T1RyGP/AQAAAAAAAKDm7Pzn9ei51sGdd96Z1VZbrS97AQAAAAAAAOgxzzzzTLbYYovWw+e6WrsohP+8Hs1pHay22moZPXp0X/YCAAAAAAAAsLTM6X7JwvHYfwAAAAAAAACoOeE/AAAAAAAAANSc8B8AAAAAAAAAak74DwAAAAAAAAA1J/wHAAAAAAAAgJoT/gMAAAAAAABAzQn/AQAAAAAAAKDmhP8AAAAAAAAAUHPCfwAAAAAAAACoOeE/AAAAAAAAANTcgL5uAAAAAN5IZs2alVdeeSWvvvpqZs2alblz5/Z1S/CG1b9//yy//PIZOnRoVlhhhZRS+rolAACAPiP8BwAAgB5QVVWef/75PP/8833dCiwzZs+enZkzZ+bFF1/M4MGD85a3vCX9+nnQJQAAsGwS/gMAAEAPeOaZZ/Liiy/Od66Ukv79+/dRR/DGN2fOnFRVlSSZPn16/v73v+etb32rJwAAAADLJOE/AAAALKEZM2bMF/yvvPLKGTp0aAYNGiSEhKVo7ty5eeWVVzJ58uTMmTMn06dPz6uvvpoVV1yxr1sDAADodZ6DBgAAAEvon//8Z9t45MiRGTlyZJZffnnBPyxl/fr1y9ChQzNq1Ki2cy+//HIfdgQAANB3hP8AAACwhKZNm9Y2ftOb3tR3jcAyasUVV2z7ss306dP7uBsAAIC+IfwHAACAJTRnzpwkyYABA9K/f/8+7gaWPf369Wv7t9f67xEAAGBZI/wHAAAAAAAAgJoT/gMAAAAAAABAzQn/AQAAAAAAAKDmhP8AAAAAAAAAUHPCfwAAAAAAAACoOeE/AAAA0CfOPvvslFJSSslf//rXpXafK6+8MjvvvHNGjBiR/v37p5SSN73pTUvtfourt/4eAAAAvDEN6OsGAAAAAJaWH/zgBzn22GP7ug0AAABY6uz8BwAAAN6Qpk2blv/8z/9Mkqy33nq5+OKLc++99+b+++/Pbbfd1sfdLbpx48allJI111yzr1sBAADgdcjOfwAAAOAN6e67786LL76YJPnmN7+Z3XbbrY87AgAAgKXHzn8AAADgDempp55qG6+zzjp92AkAAAAsfcJ/AAAA4A1p5syZbePllluuDzsBAACApU/4DwAAACwVL7zwQj7/+c9nvfXWy+DBgzNy5MjstNNO+fnPf77QNWbMmJHvfe972XHHHTNq1KgMHDiwrc6ZZ56Z2bNnL3DNDjvskFJKDjvssLZza621Vkopba/rr79+vj4nTJiQf//3f88GG2yQFVdcMQMHDsyoUaOy884758c//nFmzZrVaY/XX399h3U70rrupJNOWu
i/wUknnZRSSs4555wkyd/+9rf5PkvrCwAAgGXbgL5uAAAAAHjj+eMf/5iddtopTz/9dNu5GTNm5Lrrrst1112Xww47LNttt12XNf7whz9kr732yt/+9rf5zj/33HNtdU477bRcccUVWXXVVRe717Fjxy5wjySZMmVKJk6cmIkTJ+ZHP/pRrrrqqowaNWqx7wMAAABLk/AfAAAA6FEvvfRSdt5557bg/4ADDsihhx6akSNH5pFHHsm3vvWtTJgwIQ888ECnNR599NFsv/32efHFFzN06NAce+yx2WKLLTJmzJj84x//yC9/+cucdtppueuuu7LXXnvlpptuanu0/4QJE/Lqq6/m8ssvz4knnpgkufbaa7P66qu31V9rrbXaxnPmzMmWW26Z3XffPWPHjs2qq66aWbNm5fHHH895552Xa665Jvfee2/+7d/+rdud/UvDf/zHf2T//ffPiSeemMsvvzyrr756rr322l7vAwAAgNc34T8AAADQo04++eQ88cQTSZJTTjklJ5xwQtvcpptumv333z+77757Jk6c2GmNQw89NC+++GLGjh2biRMnZsSIEfPNv//978/uu++e3XbbLXfccUfOPvvsHHXUUUnmBft333132/p11lkna665Zof3+u1vf5u11157gfPbbLNNDjrooEyYMCGHH354brjhhlx33XXZcccdF+4P0UNGjhyZkSNH5k1velOSZLnllstGG23Uqz0AAADw+tevrxsAAAAA3jhmzZqVM888M0nyzne+M5///OcXWLPccsvlzDPPbNup395NN92UW2+9NUlyzjnnLBD8t9pll12y//77J0nOPvvsxe65o+C/2WGHHZZ3v/vdSZLLLrtsse8DAAAAS5PwHwAAAOgxkyZNygsvvJCkZfd+KaXDdaNHj8773//+Dud++ctfJknWXXfdbLzxxl3eb7vttkuS3HXXXZk9e/bitt2mqqpMnjw5jzzySB544IG21xprrJEk+cMf/rDE9wAAAIClwWP/AQAAgB5z//33t40333zzLtduscUWufLKKxc43/q4/ocffrjTLw+099prr2Xq1KkZOXLkInQ7z5VXXpkf/vCHufHGG/Pyyy93uu75559frPoAAACwtAn/AQAAgB4zderUtnF3Qfyqq67a4flnn312se49bdq0Rb6mqqocddRRbT9V0J3p06cv8j0AAACgNwj/AQAAgKViYXfttzdnzpwkybve9a6cd955C31d66P5F8VZZ53VFvy/+93vzic+8YlsueWWWWONNTJkyJD0798/SXLIIYfkJz/5SaqqWuR7AAAAQG8Q/gMAAAA95s1vfnPbeMqUKVlnnXU6XTtlypQOz6+88spJkldeeSUbbbRRzzbYzumnn54kecc73pFbb701gwcP7nBd8xMN2uvXr1/beO7cuZ2ue/XVVxezSwAAAOhev+6XAAAAACycjTfeuG181113dbm2s/mxY8cmSf7yl79k8uTJPddcBx588MEkyZ577tlp8F9VVe65555Oa6y00kpt4xdeeKHTdY888shidtlicZ+kAAAAwLJB+A8AAAD0mE033bRt939Xj8l/6qmnMnHixA7n9txzzyQtofv//d//LZ1GG2bPnp2k6135l19+eZ555plO59dcc8228d13393pugsuuGDRG2yy/PLLJ0lmzpy5RHUAAAB4Y/LYf5ZJmx5/bl+30CcmjT+kr1sAAADe4AYNGpTDDjss3/rWt/L73/8+48ePz2c/+9n51syePTtHHXVUZs2a1WGN97///dliiy1y5513Zvz48Rk7dmw+9KEPdXrP+++/P3/961+zxx57LHK/a6+9du6///5cccUVOeWUUzJ8+PD55h977LEce+yxXdZ485vfnHe+85257777MmHChBx//PEL1Ln55puX+IsMq622WpLk2WefzcsvvzzfEwcAAADAzn8AAACgR33pS1/K6NGjkySf+9zncuCBB+aaa67JPffckwsvvDDbbL
NNrr766my22Wad1jj//PMzfPjwzJkzJwcccED23HPP/PSnP82dd96ZSZMm5eqrr84pp5ySrbfeOu985ztzww03LFa
"text/plain": [
"<Figure size 2400x1200 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(12,6),dpi=200)\n",
"# https://stackoverflow.com/questions/46623583/seaborn-countplot-order-categories-by-count\n",
"sns.countplot(data=df,x='education',order=df['education'].value_counts().index,hue='default')\n",
"plt.xticks(rotation=90);"
]
},
{
"cell_type": "code",
"execution_count": 184,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='default', ylabel='count'>"
]
},
"execution_count": 184,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAZEAAAEGCAYAAACkQqisAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAV6UlEQVR4nO3dfbRddX3n8feHIGqrDkiuFBNmQjUzTnA0QARqtcuBKQQ6GnQUYapEZTXOFGbpWtUR27WKgKzROuqSFungmBI61ciolIwTxawMFR3l4fJMQIZbhCEZHlISHiyr0NDv/HF+V4/hJtzs5JzD5b5fa5119v7u3977t++5N5/sx5OqQpKkLvYadQckSTOXISJJ6swQkSR1ZohIkjozRCRJne096g4M29y5c2vBggWj7oYkzSjXX3/931TV2Pb1WRciCxYsYHx8fNTdkKQZJcm9U9U9nCVJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktSZISJJ6mzW3bG+Kw7/6CWj7sLz3vWfOXXUXZC0G9wTkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTOBhYiSV6U5NokNyfZkOTsVj84yTVJJpJ8Lck+rf7CNj7Rpi/oW9bHW/3OJMf11Ze22kSSMwe1LZKkqQ1yT+RJ4Oiqej2wGFia5Cjg08Dnq+rVwFbgtNb+NGBrq3++tSPJIuBk4BBgKfDFJHOSzAEuAI4HFgGntLaSpCEZWIhUz0/b6Avaq4Cjga+3+irgxDa8rI3Tph+TJK2+uqqerKqfABPAEe01UVV3V9VTwOrWVpI0JAM9J9L2GG4CHgLWAX8NPFJV21qTjcC8NjwPuA+gTX8U2L+/vt08O6pP1Y8VScaTjG/evHkPbJkkCQYcIlX1dFUtBubT23N4zSDXt5N+XFRVS6pqydjY2Ci6IEnPS0O5OquqHgGuBH4N2DfJ5Dcqzgc2teFNwEEAbfo/Ah7ur283z47qkqQhGeTVWWNJ9m3DLwZ+E7iDXpi8szVbDlzehte0cdr0/1VV1eont6u3DgYWAtcC1wEL29Ve+9A7+b5mUNsjSXqmQX7H+oHAqnYV1V7ApVX1rSS3A6uTfBK4Efhya/9l4M+TTABb6IUCVbUhyaXA7cA24PSqehogyRnAFcAcYGVVbRjg9kiStjOwEKmqW4BDp6jfTe/8yPb1vwPetYNlnQecN0V9LbB2tzsrSerEO9YlSZ0ZIpKkzgwRSVJnhogkqTNDRJLUmSEiSerMEJEkdWaISJI6M0QkSZ0ZIpKkzgwRSVJnhogkqTNDRJLUmSEiSerMEJEkdWaISJI6M0QkSZ0ZIpKkzgwRSVJnhogkqTNDRJLUmSEiSerMEJEkdTawEElyUJIrk9yeZEOSD7X6J5JsSnJTe53QN8/Hk0wkuTPJcX31pa02keTMvvrBSa5p9a8l2WdQ2yNJeqZB7olsA36vqhYBRwGnJ1nUpn2+qha311qANu1k4BBgKfDFJHOSzAEuAI4HFgGn9C3n021Zrwa2AqcNcHskSdsZWIhU1f1VdUMbfhy4A5i3k1mWAaur6smq+gkwARzRXhNVdXdVPQWsBpYlCXA08PU2/yrgxIFsjCRpSkM5J5JkAXAocE0rnZHkliQrk+zXavOA+/pm29hqO6rvDzxSVdu2q0+1/hVJxpOMb968eU9skiSJIYRIkpcA3wA+XFWPARcCrwIWA/cDnx10H6rqoqpaUlVLxsbGBr06SZo19h7kwpO8gF6A/EVVfROgqh7sm/4l4FttdBNwUN/s81uNHdQfBvZNsnfbG+lvL0kagkFenRXgy8AdVfW5vvqBfc3eDtzWhtcAJyd5YZKDgYXAtcB1wMJ2JdY+9E6+r6mqAq4E3tnmXw5cPqjtkSQ90yD3RH4deC9wa5KbWu336V1dtRgo4B7ggwBVtSHJpcDt9K
7sOr2qngZIcgZwBTAHWFlVG9ryPgasTvJJ4EZ6oSVJGpKBhUhV/QDIFJPW7mSe84DzpqivnWq+qrqb3tVbkqQR8I51SVJnhogkqTNDRJLUmSEiSerMEJEkdWaISJI6M0QkSZ0ZIpKkzgwRSVJnhogkqTNDRJLUmSEiSerMEJEkdWaISJI6M0QkSZ0ZIpKkzgwRSVJnhogkqTNDRJLUmSEiSerMEJEkdWaISJI6M0QkSZ0NLESSHJTkyiS3J9mQ5EOt/vIk65Lc1d73a/UkOT/JRJJbkhzWt6zlrf1dSZb31Q9Pcmub5/wkGdT2SJKeaZB7ItuA36uqRcBRwOlJFgFnAuuraiGwvo0DHA8sbK8VwIXQCx3gLOBI4AjgrMngaW1+p2++pQPcHknSdgYWIlV1f1Xd0IYfB+4A5gHLgFWt2SrgxDa8DLikeq4G9k1yIHAcsK6qtlTVVmAdsLRNe1lVXV1VBVzStyxJ0hAM5ZxIkgXAocA1wAFVdX+b9ABwQBueB9zXN9vGVttZfeMU9anWvyLJeJLxzZs3797GSJJ+ZuAhkuQlwDeAD1fVY/3T2h5EDboPVXVRVS2pqiVjY2ODXp0kzRoDDZEkL6AXIH9RVd9s5QfboSja+0Otvgk4qG/2+a22s/r8KeqSpCEZ5NVZAb4M3FFVn+ubtAaYvMJqOXB5X/3UdpXWUcCj7bDXFcCxSfZrJ9SPBa5o0x5LclRb16l9y5IkDcHeA1z2rwPvBW5NclOr/T7wKeDSJKcB9wIntWlrgROACeAJ4P0AVbUlybnAda3dOVW1pQ3/LnAx8GLg2+0lSRqSgYVIVf0A2NF9G8dM0b6A03ewrJXAyinq48Brd6ObkqTd4B3rkqTOphUiSdZPpyZJml12ejgryYuAXwLmtpPak4enXsYO7smQJM0ez3ZO5IPAh4FXAtfz8xB5DPiTwXVLkjQT7DREquoLwBeS/Ieq+uMh9UmSNENM6+qsqvrjJG8EFvTPU1WXDKhfkqQZYFohkuTPgVcBNwFPt/LkQw8lSbPUdO8TWQIsavdySJIETP8+kduAXxlkRyRJM89090TmArcnuRZ4crJYVW8bSK8kSTPCdEPkE4PshCRpZpru1VnfG3RHJEkzz3Svznqcn3951D7AC4C/raqXDapjkqTnvunuibx0crh9d8cy4KhBdUqSNDPs8lN8q+cvgeP2fHckSTPJdA9nvaNvdC9694383UB6JEmaMaZ7ddZb+4a3AffQO6QlSZrFpntO5P2D7ogkaeaZ7pdSzU9yWZKH2usbSeYPunOSpOe26Z5Y/zNgDb3vFXkl8D9aTZI0i003RMaq6s+qalt7XQyMDbBfkqQZYLoh8nCS9ySZ017vAR4eZMckSc990w2RDwAnAQ8A9wPvBN63sxmSrGznT27rq30iyaYkN7XXCX3TPp5kIsmdSY7rqy9ttYkkZ/bVD05yTat/Lck+09wWSdIeMt0QOQdYXlVjVfUKeqFy9rPMczGwdIr656tqcXutBUiyCDgZOKTN88XJvR7gAuB4YBFwSmsL8Om2rFcDW4HTprktkqQ9ZLoh8rqq2jo5UlVbgEN3NkNVXQVsmebylwGrq+rJqvoJMAEc0V4TVXV3VT0FrAaWtUevHA18vc2/CjhxmuuSJO0h0w2RvZLsNzmS5OVM/0bF7Z2R5JZ2uGtymfOA+/rabGy1HdX3Bx6pqm3b1aeUZEWS8STjmzdv7thtSdL2phsinwV+lOTcJOcCPwT+qMP6LqT3Xe2L6Z1b+WyHZeyyqrqoqpZU1ZKxMS8qk6Q9Zbp3rF+SZJzeISSAd1TV7bu6sqp6cHI4yZeAb7XRTcBBfU3ntxo7qD8M7Jtk77Y30t9ekjQk0z4k1UJjl4OjX5IDq+r+Nvp2et/dDr0bGb+S5HP0bmZcCFwLBFiY5GB6IXEy8G+rqpJcSe8qsdXAcuDy3embJGnXdT2v8aySfBV4CzA3yUbgLOAtSRbT+4Kre4APAlTVhi
SX0gupbcDpVfV0W84ZwBXAHGBlVW1oq/gYsDrJJ4EbgS8PalskSVMbWIhU1SlTlHf4D31VnQecN0V9LbB2ivrd9K7
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.countplot(data=df,x='default')"
]
},
{
"cell_type": "code",
"execution_count": 185,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.PairGrid at 0x209136c9490>"
]
},
"execution_count": 185,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 1800x1800 with 110 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# THIS TAKES A LONG TIME!\n",
"sns.pairplot(df)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Clustering"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Data Preparation\n",
"\n",
"**UNSUPERVISED LEARNING REMINDER: NO NEED FOR A TRAIN/TEST SPLIT! THERE IS NO LABEL TO \"TEST\" AGAINST!**\n",
"\n",
"We do, however, need to transform categorical features into numeric ones where it makes sense to do so, and to scale the data, because distance is a key factor in clustering."
]
},
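{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough illustration of these two preparation steps, here is a minimal sketch on a tiny made-up table. The `toy_df` frame and its values are hypothetical, and `dtype=float` is an assumption chosen so the dummy columns are plainly numeric; the notebook itself simply calls `pd.get_dummies(df)`.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"from sklearn.preprocessing import StandardScaler\n",
"\n",
"# Hypothetical toy data: one numeric and one categorical feature\n",
"toy_df = pd.DataFrame({'age': [25, 40, 60],\n",
"                       'job': ['admin.', 'services', 'admin.']})\n",
"\n",
"# One-hot encode the categorical column; numeric columns pass through unchanged\n",
"toy_X = pd.get_dummies(toy_df, dtype=float)\n",
"print(list(toy_X.columns))  # ['age', 'job_admin.', 'job_services']\n",
"\n",
"# Standardize every column to zero mean and unit variance so that\n",
"# no single feature dominates the distance computation\n",
"toy_scaled = StandardScaler().fit_transform(toy_X)\n",
"print(toy_scaled.shape)  # (3, 3)\n",
"```\n",
"\n",
"Without scaling, a wide-ranging column such as `pdays` (which contains values like 999) would likely dominate the distances that K-Means relies on."
]
},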
{
"cell_type": "code",
"execution_count": 186,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>age</th>\n",
" <th>job</th>\n",
" <th>marital</th>\n",
" <th>education</th>\n",
" <th>default</th>\n",
" <th>housing</th>\n",
" <th>loan</th>\n",
" <th>contact</th>\n",
" <th>month</th>\n",
" <th>day_of_week</th>\n",
" <th>...</th>\n",
" <th>campaign</th>\n",
" <th>pdays</th>\n",
" <th>previous</th>\n",
" <th>poutcome</th>\n",
" <th>emp.var.rate</th>\n",
" <th>cons.price.idx</th>\n",
" <th>cons.conf.idx</th>\n",
" <th>euribor3m</th>\n",
" <th>nr.employed</th>\n",
" <th>subscribed</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>56</td>\n",
" <td>housemaid</td>\n",
" <td>married</td>\n",
" <td>basic.4y</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>telephone</td>\n",
" <td>may</td>\n",
" <td>mon</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>nonexistent</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>no</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>57</td>\n",
" <td>services</td>\n",
" <td>married</td>\n",
" <td>high.school</td>\n",
" <td>unknown</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>telephone</td>\n",
" <td>may</td>\n",
" <td>mon</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>nonexistent</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>no</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>37</td>\n",
" <td>services</td>\n",
" <td>married</td>\n",
" <td>high.school</td>\n",
" <td>no</td>\n",
" <td>yes</td>\n",
" <td>no</td>\n",
" <td>telephone</td>\n",
" <td>may</td>\n",
" <td>mon</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>nonexistent</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>no</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>40</td>\n",
" <td>admin.</td>\n",
" <td>married</td>\n",
" <td>basic.6y</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>telephone</td>\n",
" <td>may</td>\n",
" <td>mon</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>nonexistent</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>no</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>56</td>\n",
" <td>services</td>\n",
" <td>married</td>\n",
" <td>high.school</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>yes</td>\n",
" <td>telephone</td>\n",
" <td>may</td>\n",
" <td>mon</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>nonexistent</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>no</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 21 columns</p>\n",
"</div>"
],
"text/plain": [
" age job marital education default housing loan contact \\\n",
"0 56 housemaid married basic.4y no no no telephone \n",
"1 57 services married high.school unknown no no telephone \n",
"2 37 services married high.school no yes no telephone \n",
"3 40 admin. married basic.6y no no no telephone \n",
"4 56 services married high.school no no yes telephone \n",
"\n",
" month day_of_week ... campaign pdays previous poutcome emp.var.rate \\\n",
"0 may mon ... 1 999 0 nonexistent 1.1 \n",
"1 may mon ... 1 999 0 nonexistent 1.1 \n",
"2 may mon ... 1 999 0 nonexistent 1.1 \n",
"3 may mon ... 1 999 0 nonexistent 1.1 \n",
"4 may mon ... 1 999 0 nonexistent 1.1 \n",
"\n",
" cons.price.idx cons.conf.idx euribor3m nr.employed subscribed \n",
"0 93.994 -36.4 4.857 5191.0 no \n",
"1 93.994 -36.4 4.857 5191.0 no \n",
"2 93.994 -36.4 4.857 5191.0 no \n",
"3 93.994 -36.4 4.857 5191.0 no \n",
"4 93.994 -36.4 4.857 5191.0 no \n",
"\n",
"[5 rows x 21 columns]"
]
},
"execution_count": 186,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 192,
"metadata": {},
"outputs": [],
"source": [
"X = pd.get_dummies(df)"
]
},
{
"cell_type": "code",
"execution_count": 193,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>age</th>\n",
" <th>duration</th>\n",
" <th>campaign</th>\n",
" <th>pdays</th>\n",
" <th>previous</th>\n",
" <th>emp.var.rate</th>\n",
" <th>cons.price.idx</th>\n",
" <th>cons.conf.idx</th>\n",
" <th>euribor3m</th>\n",
" <th>nr.employed</th>\n",
" <th>...</th>\n",
" <th>day_of_week_fri</th>\n",
" <th>day_of_week_mon</th>\n",
" <th>day_of_week_thu</th>\n",
" <th>day_of_week_tue</th>\n",
" <th>day_of_week_wed</th>\n",
" <th>poutcome_failure</th>\n",
" <th>poutcome_nonexistent</th>\n",
" <th>poutcome_success</th>\n",
" <th>subscribed_no</th>\n",
" <th>subscribed_yes</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>56</td>\n",
" <td>261</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>57</td>\n",
" <td>149</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>37</td>\n",
" <td>226</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>40</td>\n",
" <td>151</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>56</td>\n",
" <td>307</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>1.1</td>\n",
" <td>93.994</td>\n",
" <td>-36.4</td>\n",
" <td>4.857</td>\n",
" <td>5191.0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>41183</th>\n",
" <td>73</td>\n",
" <td>334</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>-1.1</td>\n",
" <td>94.767</td>\n",
" <td>-50.8</td>\n",
" <td>1.028</td>\n",
" <td>4963.6</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>41184</th>\n",
" <td>46</td>\n",
" <td>383</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>-1.1</td>\n",
" <td>94.767</td>\n",
" <td>-50.8</td>\n",
" <td>1.028</td>\n",
" <td>4963.6</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>41185</th>\n",
" <td>56</td>\n",
" <td>189</td>\n",
" <td>2</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>-1.1</td>\n",
" <td>94.767</td>\n",
" <td>-50.8</td>\n",
" <td>1.028</td>\n",
" <td>4963.6</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>41186</th>\n",
" <td>44</td>\n",
" <td>442</td>\n",
" <td>1</td>\n",
" <td>999</td>\n",
" <td>0</td>\n",
" <td>-1.1</td>\n",
" <td>94.767</td>\n",
" <td>-50.8</td>\n",
" <td>1.028</td>\n",
" <td>4963.6</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>41187</th>\n",
" <td>74</td>\n",
" <td>239</td>\n",
" <td>3</td>\n",
" <td>999</td>\n",
" <td>1</td>\n",
" <td>-1.1</td>\n",
" <td>94.767</td>\n",
" <td>-50.8</td>\n",
" <td>1.028</td>\n",
" <td>4963.6</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>41188 rows × 65 columns</p>\n",
"</div>"
],
"text/plain": [
" age duration campaign pdays previous emp.var.rate cons.price.idx \\\n",
"0 56 261 1 999 0 1.1 93.994 \n",
"1 57 149 1 999 0 1.1 93.994 \n",
"2 37 226 1 999 0 1.1 93.994 \n",
"3 40 151 1 999 0 1.1 93.994 \n",
"4 56 307 1 999 0 1.1 93.994 \n",
"... ... ... ... ... ... ... ... \n",
"41183 73 334 1 999 0 -1.1 94.767 \n",
"41184 46 383 1 999 0 -1.1 94.767 \n",
"41185 56 189 2 999 0 -1.1 94.767 \n",
"41186 44 442 1 999 0 -1.1 94.767 \n",
"41187 74 239 3 999 1 -1.1 94.767 \n",
"\n",
" cons.conf.idx euribor3m nr.employed ... day_of_week_fri \\\n",
"0 -36.4 4.857 5191.0 ... 0 \n",
"1 -36.4 4.857 5191.0 ... 0 \n",
"2 -36.4 4.857 5191.0 ... 0 \n",
"3 -36.4 4.857 5191.0 ... 0 \n",
"4 -36.4 4.857 5191.0 ... 0 \n",
"... ... ... ... ... ... \n",
"41183 -50.8 1.028 4963.6 ... 1 \n",
"41184 -50.8 1.028 4963.6 ... 1 \n",
"41185 -50.8 1.028 4963.6 ... 1 \n",
"41186 -50.8 1.028 4963.6 ... 1 \n",
"41187 -50.8 1.028 4963.6 ... 1 \n",
"\n",
" day_of_week_mon day_of_week_thu day_of_week_tue day_of_week_wed \\\n",
"0 1 0 0 0 \n",
"1 1 0 0 0 \n",
"2 1 0 0 0 \n",
"3 1 0 0 0 \n",
"4 1 0 0 0 \n",
"... ... ... ... ... \n",
"41183 0 0 0 0 \n",
"41184 0 0 0 0 \n",
"41185 0 0 0 0 \n",
"41186 0 0 0 0 \n",
"41187 0 0 0 0 \n",
"\n",
" poutcome_failure poutcome_nonexistent poutcome_success \\\n",
"0 0 1 0 \n",
"1 0 1 0 \n",
"2 0 1 0 \n",
"3 0 1 0 \n",
"4 0 1 0 \n",
"... ... ... ... \n",
"41183 0 1 0 \n",
"41184 0 1 0 \n",
"41185 0 1 0 \n",
"41186 0 1 0 \n",
"41187 1 0 0 \n",
"\n",
" subscribed_no subscribed_yes \n",
"0 1 0 \n",
"1 1 0 \n",
"2 1 0 \n",
"3 1 0 \n",
"4 1 0 \n",
"... ... ... \n",
"41183 0 1 \n",
"41184 1 0 \n",
"41185 1 0 \n",
"41186 0 1 \n",
"41187 1 0 \n",
"\n",
"[41188 rows x 65 columns]"
]
},
"execution_count": 193,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X"
]
},
{
"cell_type": "code",
"execution_count": 194,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.preprocessing import StandardScaler"
]
},
{
"cell_type": "code",
"execution_count": 195,
"metadata": {},
"outputs": [],
"source": [
"scaler = StandardScaler()"
]
},
{
"cell_type": "code",
"execution_count": 196,
"metadata": {},
"outputs": [],
"source": [
"scaled_X = scaler.fit_transform(X)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating and Fitting a KMeans Model\n",
"\n",
"Note our method choices here:\n",
"\n",
"* fit(X[, y, sample_weight])\n",
" * Compute k-means clustering.\n",
"\n",
"* fit_predict(X[, y, sample_weight])\n",
" * Compute cluster centers and predict cluster index for each sample.\n",
"\n",
"* fit_transform(X[, y, sample_weight])\n",
" * Compute clustering and transform X to cluster-distance space.\n",
"\n",
"* predict(X[, sample_weight])\n",
" * Predict the closest cluster each sample in X belongs to."
]
},
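{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the difference between these four methods more concrete, here is a minimal sketch on a small synthetic array. The toy data and the `n_init=10` and `random_state=42` settings are assumptions for this demo only, not choices made elsewhere in the notebook.\n",
"\n",
"```python\n",
"import numpy as np\n",
"from sklearn.cluster import KMeans\n",
"\n",
"# Hypothetical toy data: two well-separated groups of 2D points\n",
"toy_X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],\n",
"                  [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])\n",
"\n",
"toy_model = KMeans(n_clusters=2, n_init=10, random_state=42)\n",
"\n",
"# fit() only learns the cluster centers and returns the fitted model\n",
"toy_model.fit(toy_X)\n",
"\n",
"# fit_predict() fits and returns one cluster index per sample\n",
"toy_labels = toy_model.fit_predict(toy_X)\n",
"\n",
"# fit_transform() fits and returns the distance from each sample to each center\n",
"toy_distances = toy_model.fit_transform(toy_X)\n",
"\n",
"# predict() assigns new samples to the closest already-learned center\n",
"new_labels = toy_model.predict(np.array([[0.05, 0.05], [5.05, 5.05]]))\n",
"\n",
"print(toy_labels.shape)     # (6,)\n",
"print(toy_distances.shape)  # (6, 2)\n",
"print(new_labels.shape)     # (2,)\n",
"```\n",
"\n",
"Since we want a cluster label for every row of the data we fit on, `fit_predict` is the natural choice for this notebook."
]
},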
{
"cell_type": "code",
"execution_count": 197,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.cluster import KMeans"
]
},
{
"cell_type": "code",
"execution_count": 198,
"metadata": {},
"outputs": [],
"source": [
"model = KMeans(n_clusters=2)"
]
},
{
"cell_type": "code",
"execution_count": 199,
"metadata": {},
"outputs": [],
"source": [
"# fit_predict() fits the model and returns each sample's cluster label in one step (see the video for fit() vs transform())\n",
"cluster_labels = model.fit_predict(scaled_X)"
]
},
{
"cell_type": "code",
"execution_count": 200,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 0, 0, ..., 1, 1, 1])"
]
},
"execution_count": 200,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# IMPORTANT NOTE: Your 0s and 1s may be the opposite of ours.\n",
"# That is expected: the label numbers themselves are not significant!\n",
"cluster_labels"
]
},
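{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to convince yourself that the label numbers are arbitrary is to compare two labelings that describe the same grouping with swapped names. The sketch below uses made-up label arrays and scikit-learn's `adjusted_rand_score`, which is not used elsewhere in this notebook and is brought in here only for illustration.\n",
"\n",
"```python\n",
"import numpy as np\n",
"from sklearn.metrics import adjusted_rand_score\n",
"\n",
"# Hypothetical labels: the same grouping, with the names 0 and 1 swapped\n",
"labels_a = np.array([0, 0, 0, 1, 1, 1])\n",
"labels_b = 1 - labels_a\n",
"\n",
"# The adjusted Rand index compares groupings, not label names,\n",
"# so identical groupings score 1.0 even when the names differ\n",
"print(adjusted_rand_score(labels_a, labels_b))  # 1.0\n",
"```"
]
},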
{
"cell_type": "code",
"execution_count": 201,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"41188"
]
},
"execution_count": 201,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(scaled_X)"
]
},
{
"cell_type": "code",
"execution_count": 202,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"41188"
]
},
"execution_count": 202,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(cluster_labels)"
]
},
{
"cell_type": "code",
"execution_count": 203,
"metadata": {},
"outputs": [],
"source": [
"X['Cluster'] = cluster_labels"
]
},
{
"cell_type": "code",
"execution_count": 204,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:>"
]
},
"execution_count": 204,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 432x288 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.heatmap(X.corr())"
]
},
{
"cell_type": "code",
"execution_count": 205,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"age -0.019767\n",
"duration 0.039581\n",
"campaign -0.129103\n",
"pdays -0.267714\n",
"previous 0.478493\n",
" ... \n",
"poutcome_nonexistent -0.544406\n",
"poutcome_success 0.254406\n",
"subscribed_no -0.294472\n",
"subscribed_yes 0.294472\n",
"Cluster 1.000000\n",
"Name: Cluster, Length: 66, dtype: float64"
]
},
"execution_count": 205,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X.corr()['Cluster']"
]
},
{
"cell_type": "code",
"execution_count": 206,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:>"
]
},
"execution_count": 206,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 2400x1200 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(12,6),dpi=200)\n",
"X.corr()['Cluster'].iloc[:-1].sort_values().plot(kind='bar')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Choosing K Value"
]
},
{
"cell_type": "code",
"execution_count": 207,
"metadata": {},
"outputs": [],
"source": [
"ssd = []\n",
"\n",
"for k in range(2, 10):\n",
"    model = KMeans(n_clusters=k)\n",
"    model.fit(scaled_X)\n",
"\n",
"    # Sum of squared distances of samples to their closest cluster center.\n",
"    ssd.append(model.inertia_)"
]
},
{
"cell_type": "code",
"execution_count": 212,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Text(0, 0.5, ' Sum of Squared Distances')"
]
},
"execution_count": 212,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAERCAYAAAB2CKBkAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAr60lEQVR4nO3deXiU5b3/8fc3k5U17JAABgVRFtkRxH0pFa1S60Ld6lLRq1ahWnpq29P29Px6bIuni1ZrKbhRFXcOtVRqK65shk02QWUzAWQNa4As398fM2DELJOQyTOT+byu67ky8yyTD1yQb577vp/7NndHRESSV0rQAUREJFgqBCIiSU6FQEQkyakQiIgkORUCEZEkp0IgIpLkErIQmNljZrbVzJZHef7VZrbSzFaY2TOxzicikkgsEZ8jMLOzgX3AU+7ep4ZzewDPA+e7+y4za+/uWxsip4hIIkjIOwJ3fxvYWXGfmZ1kZq+Z2UIze8fMTokcug142N13Ra5VERARqSAhC0EVJgF3ufsg4PvAI5H9JwMnm9l7ZjbPzL4aWEIRkTiUGnSA+mBmzYAzgBfM7MjujMjXVKAHcC7QGXjbzPq6e1EDxxQRiUuNohAQvrMpcvf+lRwrAOa7ewmwzszWEC4M7zdgPhGRuNUomobcfQ/hH/JXAVhYv8jh6YTvBjCztoSbitYGEFNEJC4lZCEws2eBuUBPMysws1uB64BbzWwpsAK4PHL6LGCHma0EZgMT3H1HELlFROJRQg4fFRGR+pOQdwQiIlJ/Eq6zuG3btp6Xlxd0DBGRhLJw4cLt7t6usmMJVwjy8vLIz88POoaISEIxsw1VHVPTkIhIklMhEBFJcioEIiJJToVARCTJqRCIiCS5hBs1VBfTFxcycdZqNhUVk5OdxYSRPRk9IDfoWCIicaHRF4Lpiwu57+VlFJeUAVBYVMx9Ly8DUDEQESEJmoYmzlp9tAgcUVxSxsRZqwNKJCISXxp9IdhUVFyr/SIiyabRF4Kc7KxK93dsmdnASURE4lOjLwQTRvYkKy30pf0ZqSnsOVgSQCIRkfjS6AvB6AG53H9FX3KzszAgNzuLm844gYJdxfzwpQ+CjiciErhGP2oIwsXg2BFC55zcnpPaNQsokYhI/Gj0dwRVOe+U9nRt04TycufBf3/E1r0Hg44kIhKIpC0ER6zdvp8/vfkJYybNY8tuFQMRST5JXwi6t2/GU7cOZeueQ1z957kU7DoQdCQRkQaV9IUAYEhea6beOpRdBw5zzZ/nsWHH/qAjiYg0GBWCiAFdW/HsbcMoLS/n05162ExEkkdSjBqKVp/clrw14TwyI88d7D1YQvPMtIBTiYjElu4IjnGkCPxt6SbOe+BNVmzaHXAiEZHYUiGoQt/clqSHUrj2L/NZ+mlR0HFERGJGhaAKeW2b8tztw2mRlcr1k+ezcMPOoCOJiMSECkE1urRuwnNjh9O2eQY3TFlAoWYsFZFGSJ3FNcjJzuK5scN4bcUWcquYyVREJJHpjiAK7VtkcuPwPACWF+5m9odbgw0kIlKPVAhq6devfcjYqfm8tnxL0FFEROqFCkEt/fHagfTOacmdzyzib0s3BR1HROS4qRDUUsusNP767dMZ1LUV46Yt5qWFBUFHEhE5LioEddAsI5UnbhnCsBPbMHPZZtw96EgiInWmUUN11CQ9lcduGgKAmXGotIyM1C8viSkiEu90R3AcMtNCZKaF2F1cwhWPzOEvb68NOpKISK2pENSDJukh8to05ZczV/HHNz4KOo6ISK2oaagepIVS+MOY/qSFjAf+uYZDpeXcc9HJmFnQ0UREaqRCUE9SQyn879X9SU9N4aE3PqZZRiq3n3NS0LFERGqkQlCPQinGr644jU4ts/hav5yg44iIRCVmfQRm1sXMZpvZSjNbYWbjqjl3iJmVmtmVscrTUFJSjO9ddDI52VmUlTsv5H9KebmGl4pI/IplZ3EpcK+79wKGAX
eaWa9jTzKzEPBr4J8xzBKI11duYcKLH/CDlz6gTMVAROJUzAqBu29290WR13uBVUBuJafeBbwENLqZ3Eb27sj4C3vw4sICxj+3hJKy8qAjiYh8SYP0EZhZHjAAmH/M/lzg68B5wJBqrh8LjAXo2rVrzHLWNzNj/IUnk56awm9eW01JaTkPfnMA6akatSsi8SPmP5HMrBnh3/jHu/ueYw7/HvgPd6/2V2V3n+Tug919cLt27WKUNHa+c253/vPSXry5ZitrPtsbdBwRkS+I6R2BmaURLgJPu/vLlZwyGJgWGW/fFhhlZqXuPj2WuYJw65nduKRvJzq2zASgvNxJSdFzBiISvBrvCMzsJDPLiLw+18zuNrPsKK4zYAqwyt1/W9k57t7N3fPcPQ94EfhOYywCRxwpAs8u2Mj1U+az/1BpwIlERKJrGnoJKDOz7sAkoAvwTBTXjQBuAM43syWRbZSZ3WFmd9Q9cuLLSgsxb+0OvvXYAvYeLAk6jogkuWiahsrdvdTMvg485O4Pmdnimi5y93eBqNs+3P2maM9NdKMH5JIWSmHctMVcP2UBT908lJZN0oKOJSJJKpo7ghIz+ybwLeDVyD791DpOl5zWiT9dP4hVm/Zw3ZR5HCotCzqSiCSpaO4IbgbuAH7p7uvMrBswNbaxksNFvTow6cZBfPTZPq1lICKBsWhW1zKzLKCru6+OfaTqDR482PPz84OOERMfFBTRvnnm0U5lEZH6YmYL3X1wZcdqvCMws68BDwDpQDcz6w/8wt0vq9eUSe5QaRm3T11IemoKN52Rx+R31rGpqJic7CwmjOzJ6AGVPZQtInL8oukj+DkwFCgCcPclwIkxS5SkMlJDPHzdQLbsLuYXf1tJYVExDhQWFXPfy8uYvrgw6Igi0khF1Vns7ruP2adJc2JgYNdWtMhK59jGuuKSMibOCrxVTkQaqWg6i1eY2bVAyMx6AHcDc2IbK3lt33uo0v2bioobOImIJIto7gjuAnoDhwg/SLYbGB/DTEktJzurVvtFRI5XjYXA3Q+4+4/dfUhk+4m7H2yIcMlowsieZKV9cShpisGYoV0CSiQijV00cw29XnFuITNrZWazYpoqiY0ekMv9V/QlNzsLAzq0yKBFZhqT3l7L4o27go4nIo1QNH0Ebd296Mgbd99lZu1jF0lGD8j9wnDRgl0HuG7yfG6YsoDHbx7CkLzWAaYTkcYmmj6CcjM7uhqMmZ0AXxrYIjHUuVUTnhs7nPYtMvjWYwvIX78z6Egi0ohEc0fwY+BdM3uL8CRyZxFZLUwaTseWmUwbO4yfvLKcrq2bBB1HRBqRaKeYaEt4AXqAee6+PaapqtGYp5iojdKyclZs2kO/LtlBRxGRBFDdFBPRLlWZAewE9gC9zOzs+gondfPH2R9z5aNzeG35lqCjiEiCi2auoV8D1wAr+PyJYgfejmEuqcHNI7rx1ppt3PnMIv4wpj+XnpYTdCQRSVDR9BGMBnq6e+WPvEogWmalMfXW07nl8fe5+9nFlJSV8/UBnYOOJSIJKJqmobVoIZq41CwjlSduGcKwE9vw0+kr2LX/cNCRRCQBRXNHcABYYmb/JjzNBADufnfMUknUmqSn8thNQ/h46z5aNU0POo6IJKBoCsGMyCZxKjMtRJ/clgA88d46yhxuPbNbwKlEJFHUWAjc/cmGCCLHz91ZsH4nM5dtoaSsnDvOOSnoSCKSAKIZNdQDuB/oBRxdQ9HdtThNnDEzHhwzgFDKUn71jw85XFrO3Rf0CDqWiMS5aJqGHgd+BvwOOI/wYvbRPn8gDSw1lMLvr+lPeiiF376+hpKycu79Ss+gY4lIHIumEGS5+7/NzNx9A/BzM1sI/DTG2aSOQinGxCtPIz3VaKMOZBGpQTSF4JCZpQAfmdl3gUKgWWxjyfFKSTH+5+t9MTMA1m3fT16bJkffi4gcEU0TzzigCeElKgcB1wM3xjKU1I8jP/Q/3XmASx98hx9PX055uSaOFZ
EviqYQ5Ln7PncvcPeb3f0bQNcar5K40blVFt86I49n5m/kBy99QJmKgYhUEE0huC/KfRKnzIwJI3sy/sIevLiwgHu
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.plot(range(2,10),ssd,'o--')\n",
"plt.xlabel(\"K Value\")\n",
"plt.ylabel(\" Sum of Squared Distances\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Analyzing SSE Reduction"
]
},
{
"cell_type": "code",
"execution_count": 213,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[2469792.4095956706,\n",
" 2370787.709348152,\n",
" 2271502.7007717513,\n",
" 2221128.900236805,\n",
" 2145067.141554143,\n",
" 2132468.751266735,\n",
" 2039460.8832193925,\n",
" 2005692.7454239195]"
]
},
"execution_count": 213,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ssd"
]
},
{
"cell_type": "code",
"execution_count": 217,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0 NaN\n",
"1 -99004.700248\n",
"2 -99285.008576\n",
"3 -50373.800535\n",
"4 -76061.758683\n",
"5 -12598.390287\n",
"6 -93007.868047\n",
"7 -33768.137795\n",
"dtype: float64"
]
},
"execution_count": 217,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Change in SSD from previous K value!\n",
"pd.Series(ssd).diff()"
]
},
{
"cell_type": "code",
"execution_count": 230,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:>"
]
},
"execution_count": 230,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAZIAAAD5CAYAAAANxrPXAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAASZklEQVR4nO3df7DddX3n8efLZMNYdwT8sUAJKdkxjIK2rFwj3W6nKghBOsY6aEOnY9ZFM22ha3c7LVg6pT9kJrbOMLJVdjMSDU7biPQH6RQnBrSd6a5IgqgYrHqLWJICIkFsS4VG3/vH+URPrucSyOeec/Lj+Zg5k+/3/fmc73mf3Ln3db4/7v2mqpAk6WA9a9oNSJIObwaJJKmLQSJJ6mKQSJK6GCSSpC4GiSSpyxERJElWJflSktkkV0y7H0k6muRw/z2SJIuALwOvBXYB24GLq+qeqTYmSUeJI2GPZCUwW1X3VtWTwGZg9ZR7kqSjxpEQJCcD9w+t72o1SdIELJ52A5OQZB2wDuA5z3nOWS9+8Yun3NHCunv3Ywu6vZedfOyCbg8Wvkewz4VmnwvrSOvzzjvv/EZVvXDU2JEQJLuBU4bWl7ba91TVBmADwMzMTO3YsWNy3U3AqVf81YJub8f6Cxd0e7DwPYJ9LjT7XFhHWp9Jvjbf2JFwaGs7sCLJ8iRLgDXAlin3JElHjcN+j6Sq9ia5DNgKLAI2VtXOKbclSUeNwz5IAKrqFuCWafchSUejI+HQliRpigwSSVIXg0SS1MUgkSR1MUgkSV0MEklSF4NEktTFIJEkdTFIJEldDBJJUheDRJLUxSCRJHUxSCRJXQwSSVIXg0SS1MUgkSR1MUgkSV0MEklSF4NEktTFIJEkdTFIJEldFk+7AUlHrvvWXzjtFjQB7pFIkroYJJKkLgaJJKmLQSJJ6mKQSJK6GCSSpC4GiSSpi0EiSepikEiSuhgkkqQuBokkqcvYgiTJHyT5uySfT/LnSY4bGntnktkkX0py/lB9VavNJrliqL48yadb/SNJlrT6MW19to2fOq73I0kabZx7JNuAl1bVjwJfBt4JkOR0YA1wBrAKeH+SRUkWAe8DLgBOBy5ucwHeDVxTVS8CHgUuafVLgEdb/Zo2T5I0QWMLkqr6eFXtbau3A0vb8mpgc1U9UVVfBWaBle0xW1X3VtWTwGZgdZIArwFuas/fBLxhaFub2vJNwDltviRpQiZ1juS/AR9ryycD9w+N7Wq1+erPB745FEr76vttq40/1uZLkiak634kSW4FThwxdGVV3dzmXAnsBf6o57V6JFkHrANYtmzZtNqQpCNSV5BU1blPNZ7kvwI/DZxTVdXKu4FThqYtbTXmqT8CHJdkcdvrGJ6/b1u7kiwGjm3z5/a5AdgAMDMzU3PHJUkHb5xXba0Cfh14fVU9PjS0BVjTrrhaDqwA7gC2AyvaFVpLGJyQ39IC6JPARe35a4Gbh7a1ti1fBHxiKLAkSRMwzlvt/iFwDLCtnf++vap+oap2JrkRuIfBIa9Lq+o7AEkuA7YCi4CNVbWzbetyYHOSdwF3Ade3+vXAh5PMAnsYhI8kaYLGFiTtktz5xq4Grh5RvwW4ZUT9XgZXdc2tfxt4U1+nkqQe/ma7JKmLQSJJ6mKQSJK6GCSSpC4GiSSpi0EiSepikEiSuhgkkqQuBokkqYtBIknqYpBIkroYJJKkLgaJJKmLQSJJ6mKQSJK6GCSSpC4GiSSpi0EiSepikEiSuhgkkqQuBokkqYtBIknqYpBIkrosnnYDkqSn5771F067hZHcI5EkdTFIJEldDBJJUheDRJLUxSCRJHUxSCRJXQwSSVIXg0SS1MUgkSR1GXuQJPnVJJXkBW09Sa5NMpvk80lePjR3bZKvtMfaofpZSe5uz7k2SVr9eUm2tfnbkhw/7vcjSdrfWIMkySnAecA/DJUvAFa0xzrgujb3ecBVwCuBlcBVQ8FwHfD2oeetavUrgN
uqagVwW1uXJE3QuPdIrgF+Haih2mrghhq4HTguyUnA+cC2qtpTVY8C24BVbey5VXV7VRVwA/CGoW1tasubhuqSpAkZW5AkWQ3srqrPzRk6Gbh/aH1Xqz1VfdeIOsAJVfVAW34QOGGeXtYl2ZFkx8MPP3wwb0eSNI+uv/6b5FbgxBFDVwK/weCw1kRUVSWpecY2ABsAZmZmRs6RJB2criCpqnNH1ZO8DFgOfK6dF18KfCbJSmA3cMrQ9KWttht41Zz6X7f60hHzAR5KclJVPdAOgX295/1Ikp65sRzaqqq7q+o/VNWpVXUqg8NRL6+qB4EtwFva1VtnA4+1w1NbgfOSHN9Osp8HbG1j30pydrta6y3Aze2ltgD7ru5aO1SXJE3ING5sdQvwOmAWeBx4K0BV7Unye8D2Nu93q2pPW/4l4EPAs4GPtQfAeuDGJJcAXwPePIk3IEn6vokESdsr2bdcwKXzzNsIbBxR3wG8dET9EeCcBWtUkvSM+ZvtkqQuBokkqYtBIknqYpBIkroYJJKkLgaJJKmLQSJJ6mKQSJK6GCSSpC4GiSSpi0EiSepikEiSuhgkkqQuBokkqYtBIknqYpBIkrpM4w6J0iHrvvUXTrsF6bDjHokkqYtBIknqYpBIkroYJJKkLgaJJKmLQSJJ6mKQSJK6GCSSpC4GiSSpi0EiSepikEiSuhgkkqQuBokkqYtBIknqYpBIkrqMNUiS/HKSv0uyM8nvD9XfmWQ2yZeSnD9UX9Vqs0muGKovT/LpVv9IkiWtfkxbn23jp47z/UiSftDYgiTJq4HVwI9V1RnAe1r9dGANcAawCnh/kkVJFgHvAy4ATgcubnMB3g1cU1UvAh4FLmn1S4BHW/2aNk+SNEHj3CP5RWB9VT0BUFVfb/XVwOaqeqKqvgrMAivbY7aq7q2qJ4HNwOokAV4D3NSevwl4w9C2NrXlm4Bz2nxJ0oSMM0hOA36yHXL6mySvaPWTgfuH5u1qtfnqzwe+WVV759T321Ybf6zNlyRNSNc925PcCpw4YujKtu3nAWcDrwBuTPIfe17vYCVZB6wDWLZs2TRakKQjVleQVNW5840l+UXgz6qqgDuSfBd4AbAbOGVo6tJWY576I8BxSRa3vY7h+fu2tSvJYuDYNn9unxuADQAzMzP1TN+nJGl+4zy09RfAqwGSnAYsAb4BbAHWtCuulgMrgDuA7cCKdoXWEgYn5Le0IPokcFHb7lrg5ra8pa3Txj/R5kuSJqRrj+QANgIbk3wBeBJY237I70xyI3APsBe4tKq+A5DkMmArsAjYWFU727YuBzYneRdwF3B9q18PfDjJLLCHQfhIkiZobEHSrrz6+XnGrgauHlG/BbhlRP1eBld1za1/G3hTd7OSpIPmb7ZLkroYJJKkLgaJJKmLQSJJ6mKQSJK6GCSSpC4GiSSpi0EiSepikEiSuhgkkqQuBokkqYtBIknqYpBIkroYJJKkLgaJJKmLQSJJ6mKQSJK6GCSSpC4GiSSpi0EiSepikEiSuhgkkqQuBokkqYtBIknqYpBIkroYJJKkLgaJJKmLQSJJ6mKQSJK6LJ52A5KeufvWXzjtFqTvcY9EktTFIJEkdTFIJEldxhYkSc5McnuSzybZkWRlqyfJtUlmk3w+ycuHnrM2yVfaY+1Q/awkd7fnXJskrf68JNva/G1Jjh/X+5EkjTbOPZLfB36nqs4EfqutA1wArGiPdcB1MAgF4CrglcBK4KqhYLgOePvQ81a1+hXAbVW1AritrUuSJmicQVLAc9vyscA/tuXVwA01cDtwXJKTgPOBbVW1p6oeBbYBq9rYc6vq9qoq4AbgDUPb2tSWNw3VJUkTMs7Lf38F2JrkPQwC6z+3+snA/UPzdrXaU9V3jagDnFBVD7TlB4ETFrB/SdLT0BUkSW4FThwxdCVwDvA/qupPk7wZuB44t+f1nkpVVZKap891DA6jsWzZsnG1IElHpa4gqap5gyHJDcA72upHgQ+05d
3AKUNTl7babuBVc+p/3epLR8wHeCjJSVX1QDsE9vV5+twAbACYmZkZGTaSpIMzznMk/wj8VFt+DfCVtrwFeEu7eut
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"pd.Series(ssd).diff().plot(kind='bar')"
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 1
}