{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "___\n",
    "\n",
    "<a href='http://www.pieriandata.com'><img src='../Pierian_Data_Logo.png'/></a>\n",
    "___\n",
    "<center><em>Copyright by Pierian Data Inc.</em></center>\n",
    "<center><em>For more information, visit us at <a href='http://www.pieriandata.com'>www.pieriandata.com</a></em></center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# DBSCAN Project Solutions\n",
    "\n",
    "## The Data\n",
    "\n",
    "\n",
    "Source: https://archive.ics.uci.edu/ml/datasets/Wholesale+customers\n",
    "\n",
    "Margarida G. M. S. Cardoso, margarida.cardoso '@' iscte.pt, ISCTE-IUL, Lisbon, Portugal\n",
    "\n",
    "\n",
    "Data Set Information:\n",
    "\n",
    "Provide all relevant information about your data set.\n",
    "\n",
    "\n",
    "Attribute Information:\n",
    "\n",
    "    1) FRESH: annual spending (m.u.) on fresh products (Continuous);\n",
    "    2) MILK: annual spending (m.u.) on milk products (Continuous);\n",
    "    3) GROCERY: annual spending (m.u.) on grocery products (Continuous);\n",
    "    4) FROZEN: annual spending (m.u.) on frozen products (Continuous);\n",
    "    5) DETERGENTS_PAPER: annual spending (m.u.) on detergents and paper products (Continuous);\n",
    "    6) DELICATESSEN: annual spending (m.u.) on delicatessen products (Continuous);\n",
    "    7) CHANNEL: customers' channel - Horeca (Hotel/Restaurant/Café) or Retail channel (Nominal);\n",
    "    8) REGION: customers' region - Lisbon, Oporto or Other (Nominal)\n",
    " \n",
    "\n",
    "Relevant Papers:\n",
    "\n",
    "Cardoso, Margarida G.M.S. (2013). Logical discriminant models – Chapter 8 in Quantitative Modeling in Marketing and Management Edited by Luiz Moutinho and Kun-Huang Huarng. World Scientific. pp. 223-253. ISBN 978-9814407717\n",
    "\n",
    "Jean-Patrick Baudry, Margarida Cardoso, Gilles Celeux, Maria José Amorim, Ana Sousa Ferreira (2012). Enhancing the selection of a model-based clustering with external qualitative variables. RESEARCH REPORT N° 8124, October 2012, Project-Team SELECT. INRIA Saclay - Île-de-France, Projet select, Université Paris-Sud 11\n",
    "\n",
    "\n",
    "\n",
    "-----"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## DBSCAN and Clustering Examples\n",
    "\n",
    "**COMPLETE THE TASKS IN BOLD BELOW:**\n",
    "\n",
    "**TASK: Run the following cells to import the data and view the DataFrame.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv('../DATA/wholesome_customers_data.csv')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Channel</th>\n",
       "      <th>Region</th>\n",
       "      <th>Fresh</th>\n",
       "      <th>Milk</th>\n",
       "      <th>Grocery</th>\n",
       "      <th>Frozen</th>\n",
       "      <th>Detergents_Paper</th>\n",
       "      <th>Delicassen</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>12669</td>\n",
       "      <td>9656</td>\n",
       "      <td>7561</td>\n",
       "      <td>214</td>\n",
       "      <td>2674</td>\n",
       "      <td>1338</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>7057</td>\n",
       "      <td>9810</td>\n",
       "      <td>9568</td>\n",
       "      <td>1762</td>\n",
       "      <td>3293</td>\n",
       "      <td>1776</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>6353</td>\n",
       "      <td>8808</td>\n",
       "      <td>7684</td>\n",
       "      <td>2405</td>\n",
       "      <td>3516</td>\n",
       "      <td>7844</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>13265</td>\n",
       "      <td>1196</td>\n",
       "      <td>4221</td>\n",
       "      <td>6404</td>\n",
       "      <td>507</td>\n",
       "      <td>1788</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>22615</td>\n",
       "      <td>5410</td>\n",
       "      <td>7198</td>\n",
       "      <td>3915</td>\n",
       "      <td>1777</td>\n",
       "      <td>5185</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Channel  Region  Fresh  Milk  Grocery  Frozen  Detergents_Paper  Delicassen\n",
       "0        2       3  12669  9656     7561     214              2674        1338\n",
       "1        2       3   7057  9810     9568    1762              3293        1776\n",
       "2        2       3   6353  8808     7684    2405              3516        7844\n",
       "3        1       3  13265  1196     4221    6404               507        1788\n",
       "4        2       3  22615  5410     7198    3915              1777        5185"
      ]
     },
     "execution_count": 78,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 440 entries, 0 to 439\n",
      "Data columns (total 8 columns):\n",
      " #   Column            Non-Null Count  Dtype\n",
      "---  ------            --------------  -----\n",
      " 0   Channel           440 non-null    int64\n",
      " 1   Region            440 non-null    int64\n",
      " 2   Fresh             440 non-null    int64\n",
      " 3   Milk              440 non-null    int64\n",
      " 4   Grocery           440 non-null    int64\n",
      " 5   Frozen            440 non-null    int64\n",
      " 6   Detergents_Paper  440 non-null    int64\n",
      " 7   Delicassen        440 non-null    int64\n",
      "dtypes: int64(8)\n",
      "memory usage: 27.6 KB\n"
     ]
    }
   ],
   "source": [
    "df.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## EDA\n",
    "\n",
    "**TASK: Create a scatterplot showing the relationship between MILK and GROCERY spending, colored by the Channel column.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [],
   "source": [
    "#CODE HERE"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<AxesSubplot:xlabel='Milk', ylabel='Grocery'>"
      ]
     },
     "execution_count": 68,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "<base64-encoded PNG data truncated in source>",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "sns.scatterplot(data=df,x='Milk',y='Grocery',hue='Channel')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**TASK: Use seaborn to create a histogram of MILK spending, colored by Channel. Can you figure out how to use seaborn to \"stack\" the channels, instead of having them overlap?**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#CODE HERE"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<AxesSubplot:xlabel='Milk', ylabel='Count'>"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "<base64-encoded PNG data truncated in source>",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "sns.histplot(df,x='Milk',hue='Channel',multiple=\"stack\")"
   ]
  },
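  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch, added for comparison and not part of the original solution: besides `stack`, the `multiple` parameter of `sns.histplot` also accepts `layer`, `dodge` and `fill`. The cell below assumes `df`, `plt` and `sns` are defined as in the cells above; the figure size and titles are arbitrary choices made for this illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Draw the same histogram with two alternative settings side by side\n",
    "fig, axes = plt.subplots(1, 2, figsize=(12, 4))\n",
    "\n",
    "# 'dodge' places the bars for each channel next to each other within a bin\n",
    "sns.histplot(data=df, x='Milk', hue='Channel', multiple='dodge', ax=axes[0])\n",
    "axes[0].set_title('multiple = dodge')\n",
    "\n",
    "# 'fill' normalises each bin so the channels show as proportions\n",
    "sns.histplot(data=df, x='Milk', hue='Channel', multiple='fill', ax=axes[1])\n",
    "axes[1].set_title('multiple = fill')"
   ]
  },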
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**TASK: Create an annotated clustermap of the correlations between spending on different categories.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {},
   "outputs": [],
   "source": [
    "# CODE HERE"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Correlation Between Spending Categories\n"
     ]
    },
    {
     "data": {
      "image/png": "<base64-encoded PNG data truncated in source>",
      "text/plain": [
       "<Figure size 720x720 with 4 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "print('Correlation Between Spending Categories')\n",
    "sns.clustermap(df.drop(['Region','Channel'],axis=1).corr(),annot=True);"
   ]
  },
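  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a hedged follow-up (an addition, not part of the course solution): to read the strongest relationships as numbers instead of colors, the same correlation matrix can be flattened and sorted. The names `spend_corr`, `mask` and `pairs` are introduced here purely for illustration, and the cell assumes `df` and `np` from the cells above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Same correlation matrix that is passed to the clustermap above\n",
    "spend_corr = df.drop(['Region','Channel'],axis=1).corr()\n",
    "\n",
    "# Boolean mask for the strict upper triangle, so each pair of categories\n",
    "# appears once and the diagonal of 1.0 values is excluded\n",
    "mask = np.triu(np.ones(spend_corr.shape, dtype=bool), k=1)\n",
    "\n",
    "# Flatten to a Series indexed by (category, category) and sort by correlation\n",
    "pairs = spend_corr.where(mask).stack().dropna().sort_values(ascending=False)\n",
    "pairs.head()"
   ]
  },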
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**TASK: Create a PairPlot of the dataframe, colored by Region.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#CODE HERE"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<seaborn.axisgrid.PairGrid at 0x2d711759c40>"
      ]
     },
     "execution_count": 75,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "<base64-encoded PNG data truncated in source>",
      "text/plain": [
       "<Figure size 1302.38x1260 with 56 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "sns.pairplot(df,hue='Region',palette='Set1')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## DBSCAN\n",
    "\n",
    "**TASK: Since the feature values span different orders of magnitude, use StandardScaler to scale the data.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 87,
   "metadata": {},
   "outputs": [],
   "source": [
    "#CODE HERE"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 89,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.preprocessing import StandardScaler\n",
    "scaler = StandardScaler()\n",
    "scaled_X = scaler.fit_transform(df)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 90,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.44865163,  0.59066829,  0.05293319, ..., -0.58936716,\n",
       "        -0.04356873, -0.06633906],\n",
       "       [ 1.44865163,  0.59066829, -0.39130197, ..., -0.27013618,\n",
       "         0.08640684,  0.08915105],\n",
       "       [ 1.44865163,  0.59066829, -0.44702926, ..., -0.13753572,\n",
       "         0.13323164,  2.24329255],\n",
       "       ...,\n",
       "       [ 1.44865163,  0.59066829,  0.20032554, ..., -0.54337975,\n",
       "         2.51121768,  0.12145607],\n",
       "       [-0.69029709,  0.59066829, -0.13538389, ..., -0.41944059,\n",
       "        -0.56977032,  0.21304614],\n",
       "       [-0.69029709,  0.59066829, -0.72930698, ..., -0.62009417,\n",
       "        -0.50488752, -0.52286938]])"
      ]
     },
     "execution_count": 90,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "scaled_X"
   ]
  },
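  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A quick sanity check, added here as a sketch and not taken from the original solution: after `StandardScaler`, every column of `scaled_X` should have a mean close to 0 and a standard deviation close to 1. Note that the solution passes the whole `df` to the scaler, so the categorical `Channel` and `Region` codes are scaled along with the spending columns. The cell assumes `scaled_X` and `np` from the cells above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Column-wise mean of the scaled array (expected to be close to 0)\n",
    "print(np.round(scaled_X.mean(axis=0), 6))\n",
    "\n",
    "# Column-wise population standard deviation (expected to be close to 1);\n",
    "# np.std uses ddof=0 by default, which matches how StandardScaler scales\n",
    "print(np.round(scaled_X.std(axis=0), 6))"
   ]
  },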
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**TASK: Use DBSCAN and a for loop to create a variety of models testing different epsilon values. Set min_samples equal to 2 times the number of features. During the loop, keep track of and log the percentage of points that are outliers. For reference, the solution notebook uses the following range of epsilon values for testing:**\n",
    "\n",
    "    np.linspace(0.001,3,50)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {},
   "outputs": [],
   "source": [
    "#CODE HERE"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 96,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.cluster import DBSCAN"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 97,
   "metadata": {},
   "outputs": [],
   "source": [
    "outlier_percent = []\n",
    "\n",
    "for eps in np.linspace(0.001,3,50):\n",
    "    \n",
    "    # Create Model\n",
    "    dbscan = DBSCAN(eps=eps,min_samples=2*scaled_X.shape[1])\n",
    "    dbscan.fit(scaled_X)\n",
    "    \n",
    "    \n",
    "    # Log percentage of points that are outliers\n",
    "    perc_outliers = 100 * np.sum(dbscan.labels_ == -1) / len(dbscan.labels_)\n",
    "    \n",
    "    outlier_percent.append(perc_outliers)"
   ]
  },
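  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For intuition, a small self-contained sketch (an illustration added here, using made-up points and not the project data): DBSCAN assigns the label `-1` to noise points, which is what `dbscan.labels_ == -1` counts in the loop above. The `toy_X` array, the `toy_model` name and the `eps` and `min_samples` values are assumptions chosen for this example. Four points sit close together and one is far away, so the isolated point should be the only one labelled as noise."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.cluster import DBSCAN\n",
    "\n",
    "# Four tightly grouped points plus one isolated point (hypothetical data)\n",
    "toy_X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]])\n",
    "\n",
    "# A point is a core point if at least min_samples points (itself included)\n",
    "# lie within distance eps of it\n",
    "toy_model = DBSCAN(eps=0.5, min_samples=3).fit(toy_X)\n",
    "\n",
    "# Cluster label per point; -1 marks noise\n",
    "print(toy_model.labels_)\n",
    "\n",
    "# Percentage of points flagged as outliers, computed as in the loop above\n",
    "print(100 * np.sum(toy_model.labels_ == -1) / len(toy_model.labels_))"
   ]
  },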
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**TASK: Create a line plot of the percentage of outlier points versus the epsilon value choice.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {},
   "outputs": [],
   "source": [
    "#CODE HERE"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Text(0.5, 0, 'Epsilon Value')"
      ]
     },
     "execution_count": 99,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAEGCAYAAACKB4k+AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAs7klEQVR4nO3deXxU5b3H8c8vOyQhBJKwRyIgCMou7nWhrihWS622rrW1tr2lrXaxra21t3bR2l6XW5VWq7WtV6vWBa0bgmulsiubIIuACEGBsJPld/+YkxgxmRySTE4y832/Xuc1M2dmcr7HwfnNeZ5znsfcHREREYC0qAOIiEj7oaIgIiJ1VBRERKSOioKIiNRRURARkToZUQdoiaKiIu/fv3/UMUREOpTZs2dvcvfihp7r0EWhf//+zJo1K+oYIiIdipmtbuw5NR+JiEgdFQUREamjoiAiInVUFEREpE6TRcHMPmdm+cH9a8zsETMbnfhoIiLS1sIcKfzE3beZ2THAp4G7gNubepOZ3W1mG83srXrrupnZc2a2LLgtDNabmd1iZsvNbIGKjohINMIUhergdgIwxd2fBLJCvO8e4NR91l0NTHP3QcC04DHAacCgYLmcEEVHRERaX5jrFNaZ2Z3AScBvzCybEMXE3V8ys/77rD4LOD64fy8wA/hBsP4vHhvH+3Uz62pmvdx9fai92E9vrPqQl98uT8Sf/phD+hRw8rCeCd+OiEhrCVMUziX2i/+37r7FzHoB32vm9nrU+6J/H+gR3O8DrKn3urXBuk8UBTO7nNjRBKWlpc0KMWf1Zm6dvrxZ7w2rdpqKP140lpOG9oj/YhGRdiJuUTCzdGCOuw+pXRd8qbf4F7y7u5nt9ww/7j4FmAIwduzYZs0Q9NXjBvDV4wY0562h7a6s5tw7/82VD8zj0f86mgHFeQndnohIa4jbDOTu1cBSM2veT/JP2hAcaRDcbgzWrwP61Xtd32Bdh5WTmc7tF4whMyONK+6bzfY9VVFHEhFpUpiO5kJgoZlNM7PHa5dmbu9x4OLg/sXAY/XWXxSchXQEsDVR/QltqU/XTtx2/ijeKd/O9x+aj6Y+FZH2Lkyfwk+a84fN7H5incpFZrYWuBb4NfCgmV0GrCbWXwHwFHA6sBzYCVzanG22R0cNLOLq04bwy6eWcOdLK7giwc1WIiIt0WRRcPcXzewAYJC7P29mnYH0EO87v5GnxjfwWge+0dTf7Ki+cuyBzF+7lRueXsKw3l04dlCDI9aKiEQuzBXNXwEeAu4MVvUBHk1gpqRjZtzw2eEMKsln8v1zWfPhzqgjiYg0KEyfwjeAo4EKAHdfBpQkMlQyys3O4I4Lx1BV43ztb7Opqq6JOpKIyCeEKQp73H1v7QMzywDUY9oMZUW5XDdxGG+tq+CNVZujjiMi8glhisKLZvYjoJOZnQT8A3gisbGS18nDepKZbkxfurHpF4uItLEwReFqoBx4E/gqsTOFrklkqGSWl53B4WXdeWGJioKItD9hxjCqcfc/uvvn3H1ScF/NRy1wwpASlm/crg5nEWl3Gi0KZvZgcPtmMJz1x5a2i5h8ThwS66fX0YKItDfxrlP4VnB7RlsESSVlRbn0796Z6Us3cvFR/aOOIyJSp9GiUDvMhLuvbrs4qeOEISX8fea77NpbTaesJq8FFBFpE/Gaj7aZWUW924r6j9syZDI6cUgJe6pqeO2dTVFHERGp02hRcPd8d+9S77ZL/cdtGTIZjSvrRuesdPUriEi7EmaYi/vCrJP9k52RzjEDi5i+ZKNGTxWRdiPMdQrD6j8Irmgek5g4qeXEISW8t3U3SzdsizqKiAgQv0/hh2a2DRhevz8B2MBH8yBIC5ygU1NFpJ2J16fwK3fPB27cpz+hu7v/sA0zJq0eXXIY1rsLM5aURx1FRAQIN8nOv8zsU/uudPeXEpAn5ZwwuITbX3yHrTsrKeicGXUcEUlxYYrC9+rdzwHGAb
OBExOSKMWcMKSE26Yv58Vl5Uwc0TvqOCKS4sKMfXRmveUk4BBA4z63kpH9utItN4vp6lcQkXYgzNlH+1oLHNzaQVJVeppx3EHFzFi6keoanZoqItFqsvnIzG7lo0l10oCRwJwEZko5Jwwp4Z9z1zFvzRbGHFAYdRwRSWFh+hRm1btfBdzv7q8mKE9KOm5QMelpxvQlG1UURCRSYZqPHiDWsTwbeFgFofUVdM5kTGmhrlcQkcjFu3gtw8xuINaHcC/wF2CNmd1gZjp3spUdP6SYResreH/r7qijiEgKi3ekcCPQDShz9zHuPhoYAHQFftsG2VLKyUN7Yga/+tdijYUkIpGJVxTOAL7i7nUD87h7BfA14PREB0s1A0vyuOqkg3hs3nvc/eqqqOOISIqKVxS8obmY3b2aj85Gklb09eMHcvLQHvzyqcW8vuKDqOOISAqKVxQWmdlF+640swuAJYmLlLrS0oybzh3BAd07819/n8P6rbuijiQiKSZeUfgG8A0zm2FmNwXLi8BkYk1IkgD5OZlMuXAMu/ZWc8Vf57CnqjrqSCKSQuKNkrrO3Q8Hfg6sCpafu/s4d1/XNvFS08CSfG46dwTz12zhZ48vjDqOiKSQJi9ec/cXgBfaIIvUc+ohvfj68QP4w4x3GN63K+ePK406koikgOaMfSRt5KqTB3PsoCKufWwh89dsiTqOiKQAFYV2LD3NuOW8UXTPy+LHj75JjQbME5EEa7IomFmumaUF9w8ys4ktvaLZzL5jZgvN7C0zu9/McsyszMxmmtlyM3vAzLJaso1kUZibxQ9OHcJb6yr451x15YhIYoU5UngJyDGzPsCzwIXAPc3dYPB3JgNj3f0QIB04D/gN8Ht3H0hsvobLmruNZDNxRG+G9y3gxmeWsmuvzkYSkcQJUxTM3XcC5wB/cPfPAcNauN0MoJOZZQCdgfXEZnJ7KHj+XuAzLdxG0khLM66ZMJT3K3bzx5dXRB1HRJJYqKJgZkcCXwSeDNalN3eDwemsvwXeJVYMthIbgXWLu1cFL1sL9GkkzOVmNsvMZpWXp86E9+PKunHqsJ7c8eI7bKzQoHkikhhhisK3gB8C/3T3hWZ2IDC9uRs0s0LgLKAM6A3kAqeGfb+7T3H3se4+tri4uLkxOqSrTxtCZXUNNz37dtRRRCRJhZmj+SV3n+juvwker3D3yS3Y5qeBle5e7u6VwCPA0UDXoDkJoC+gXtV99C/K5aIj+/Pg7DUsXl8RdRwRSUJhzj4qNrMbzewpM3uhdmnBNt8FjjCzzmZmwHhgEbGjj0nBay4GHmvBNpLW5BMHUdApk+uf1BDbItL6wjQf/Y3YAHhlwHXEhrt4o7kbdPeZxDqU5wBvBhmmAD8ArjSz5UB34K7mbiOZFXTOZPKJg3hl+SZmLE2dPhURaRvW1K9NM5vt7mPMbIG7Dw/WveHuh7VJwjjGjh3rs2bNavqFSWZvVQ2n/M9LpKcZT3/rWDLSdQ2iiIQXfK+Pbei5MN8mlcHtejObYGajiM3IJhHJykjj6tOGsHzjdv42892o44hIEmlyQDzgF2ZWAFwF3Ap0Ab6T0FTSpJOH9uCYgUX8fOoiOmWmc+5h/aKOJCJJIMwoqVODu1uBExIbR8IyM+64cAxf++tsvv/wAt6v2M03TxxIrO9eRKR51BjdgeVlZ3DXxYdxzqg+/O65t/nxo29RVV0TdSwR6cDCNB9JO5aVkcZN546gR0EOt894h40Ve7j1/FF0ymr2ReciksJ0pJAEzIwfnDqE6yYOY9qSDXzxT6+zecfeqGOJSAfU6JGCmV0Z743u/rvWjyMtcfFR/SnJz+ZbD8xj0h2v8X+XH0lxfnbUsUSkA4l3pJAfLGOBrxEboK4PcAUwOvHRpDlOO7QXf/nSON7bspsL75qpIwYR2S+NFgV3v87dryM2DtFod7/K3a8CxgCaMLgdO+LA7vzxorGs2LSDi+7+DxW7K5t+k4gI4foUegD1f27uDdZJO3
bMoCLuuGA0S96v4NI/v8GOPVVNv0lEUl6YovAX4D9m9jMz+xkwk9gkONLOnTikB7ecN4q5727my/fOYnelZm0Tkfj
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.lineplot(x=np.linspace(0.001,3,50),y=outlier_percent)\n",
"plt.ylabel(\"Percentage of Points Classified as Outliers\")\n",
"plt.xlabel(\"Epsilon Value\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## DBSCAN with Chosen Epsilon\n",
"\n",
"**TASK: Based on the plot created in the previous task, retrain a DBSCAN model with a reasonable epsilon value. Note: For reference, the solutions use eps=2.**"
]
},
{
"cell_type": "code",
"execution_count": 102,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"DBSCAN(eps=2)"
]
},
"execution_count": 102,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dbscan = DBSCAN(eps=2)\n",
"dbscan.fit(scaled_X)"
]
},
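To make the roles of `eps` and `min_samples` more concrete, here is a minimal pure-Python sketch of the DBSCAN idea on a toy 2-D dataset. It is an illustration under stated assumptions, not scikit-learn's implementation: the function name `simple_dbscan`, the toy points and the choice `min_samples=3` are all made up for this example. As in scikit-learn, noise points receive the label -1 and a point counts as its own neighbour.

```python
from math import dist


def simple_dbscan(points, eps, min_samples):
    """Toy DBSCAN: returns one label per point, with -1 meaning noise/outlier."""
    n = len(points)
    # Neighbourhood of each point: every point within eps (the point itself included).
    neighbours = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # A core point has at least min_samples points in its neighbourhood.
    is_core = [len(nb) >= min_samples for nb in neighbours]

    labels = [-1] * n  # start with everything marked as noise
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue  # already assigned, or not able to seed a cluster
        # Grow a new cluster outward from this unassigned core point.
        labels[i] = cluster
        frontier = [i]
        while frontier:
            p = frontier.pop()
            for q in neighbours[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    if is_core[q]:
                        frontier.append(q)  # only core points keep expanding
        cluster += 1
    return labels


# Two tight groups plus one far-away point (hypothetical toy data).
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11),
              (50, 50)]
toy_labels = simple_dbscan(toy_points, eps=2, min_samples=3)
outlier_percent_toy = 100 * toy_labels.count(-1) / len(toy_labels)
print(toy_labels, outlier_percent_toy)
```

The two groups come out as clusters 0 and 1, and the isolated point keeps the label -1. In the notebook the same kind of information lives in `dbscan.labels_`, which is why the later plots and group means treat -1 as the outlier group.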
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Create a scatterplot of Milk vs Grocery, colored by the discovered labels of the DBSCAN model.**"
]
},
{
"cell_type": "code",
"execution_count": 127,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 128,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='Grocery', ylabel='Milk'>"
]
},
"execution_count": 128,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.scatterplot(data=df,x='Grocery',y='Milk',hue=dbscan.labels_)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Create a scatterplot of Milk vs. Detergents Paper colored by the labels.**"
]
},
{
"cell_type": "code",
"execution_count": 133,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 134,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='Detergents_Paper', ylabel='Milk'>"
]
},
"execution_count": 134,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.scatterplot(data=df,x='Detergents_Paper',y='Milk',hue=dbscan.labels_)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Create a new column on the original dataframe called \"Labels\" consisting of the DBSCAN labels.**"
]
},
{
"cell_type": "code",
"execution_count": 106,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 107,
"metadata": {},
"outputs": [],
"source": [
"df['Labels'] = dbscan.labels_"
]
},
{
"cell_type": "code",
"execution_count": 108,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
"    .dataframe tbody tr th:only-of-type {\n",
"        vertical-align: middle;\n",
"    }\n",
"\n",
"    .dataframe tbody tr th {\n",
"        vertical-align: top;\n",
"    }\n",
"\n",
"    .dataframe thead th {\n",
"        text-align: right;\n",
"    }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>Channel</th>\n",
"      <th>Region</th>\n",
"      <th>Fresh</th>\n",
"      <th>Milk</th>\n",
"      <th>Grocery</th>\n",
"      <th>Frozen</th>\n",
"      <th>Detergents_Paper</th>\n",
"      <th>Delicassen</th>\n",
"      <th>Labels</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>0</th>\n",
"      <td>2</td>\n",
"      <td>3</td>\n",
"      <td>12669</td>\n",
"      <td>9656</td>\n",
"      <td>7561</td>\n",
"      <td>214</td>\n",
"      <td>2674</td>\n",
"      <td>1338</td>\n",
"      <td>0</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1</th>\n",
"      <td>2</td>\n",
"      <td>3</td>\n",
"      <td>7057</td>\n",
"      <td>9810</td>\n",
"      <td>9568</td>\n",
"      <td>1762</td>\n",
"      <td>3293</td>\n",
"      <td>1776</td>\n",
"      <td>0</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>2</th>\n",
"      <td>2</td>\n",
"      <td>3</td>\n",
"      <td>6353</td>\n",
"      <td>8808</td>\n",
"      <td>7684</td>\n",
"      <td>2405</td>\n",
"      <td>3516</td>\n",
"      <td>7844</td>\n",
"      <td>0</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>3</th>\n",
"      <td>1</td>\n",
"      <td>3</td>\n",
"      <td>13265</td>\n",
"      <td>1196</td>\n",
"      <td>4221</td>\n",
"      <td>6404</td>\n",
"      <td>507</td>\n",
"      <td>1788</td>\n",
"      <td>1</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>4</th>\n",
"      <td>2</td>\n",
"      <td>3</td>\n",
"      <td>22615</td>\n",
"      <td>5410</td>\n",
"      <td>7198</td>\n",
"      <td>3915</td>\n",
"      <td>1777</td>\n",
"      <td>5185</td>\n",
"      <td>0</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"   Channel  Region  Fresh  Milk  Grocery  Frozen  Detergents_Paper  \\\n",
"0        2       3  12669  9656     7561     214              2674   \n",
"1        2       3   7057  9810     9568    1762              3293   \n",
"2        2       3   6353  8808     7684    2405              3516   \n",
"3        1       3  13265  1196     4221    6404               507   \n",
"4        2       3  22615  5410     7198    3915              1777   \n",
"\n",
"   Delicassen  Labels  \n",
"0        1338       0  \n",
"1        1776       0  \n",
"2        7844       0  \n",
"3        1788       1  \n",
"4        5185       0  "
]
},
"execution_count": 108,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Compare the mean spending per category across the clusters and the outliers.**"
]
},
{
"cell_type": "code",
"execution_count": 109,
"metadata": {},
"outputs": [],
"source": [
"# CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 114,
"metadata": {},
"outputs": [],
"source": [
"cats = df.drop(['Channel','Region'],axis=1)\n",
"cat_means = cats.groupby('Labels').mean()"
]
},
{
"cell_type": "code",
"execution_count": 115,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
"    .dataframe tbody tr th:only-of-type {\n",
"        vertical-align: middle;\n",
"    }\n",
"\n",
"    .dataframe tbody tr th {\n",
"        vertical-align: top;\n",
"    }\n",
"\n",
"    .dataframe thead th {\n",
"        text-align: right;\n",
"    }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>Fresh</th>\n",
"      <th>Milk</th>\n",
"      <th>Grocery</th>\n",
"      <th>Frozen</th>\n",
"      <th>Detergents_Paper</th>\n",
"      <th>Delicassen</th>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Labels</th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>-1</th>\n",
"      <td>30161.529412</td>\n",
"      <td>26872.411765</td>\n",
"      <td>33575.823529</td>\n",
"      <td>12380.235294</td>\n",
"      <td>14612.294118</td>\n",
"      <td>8185.411765</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>0</th>\n",
"      <td>8200.681818</td>\n",
"      <td>8849.446970</td>\n",
"      <td>13919.113636</td>\n",
"      <td>1527.174242</td>\n",
"      <td>6037.280303</td>\n",
"      <td>1548.310606</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1</th>\n",
"      <td>12662.869416</td>\n",
"      <td>3180.065292</td>\n",
"      <td>3747.250859</td>\n",
"      <td>3228.862543</td>\n",
"      <td>764.697595</td>\n",
"      <td>1125.134021</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"               Fresh          Milk       Grocery        Frozen  \\\n",
"Labels                                                           \n",
"-1      30161.529412  26872.411765  33575.823529  12380.235294   \n",
" 0       8200.681818   8849.446970  13919.113636   1527.174242   \n",
" 1      12662.869416   3180.065292   3747.250859   3228.862543   \n",
"\n",
"        Detergents_Paper   Delicassen  \n",
"Labels                                \n",
"-1          14612.294118  8185.411765  \n",
" 0           6037.280303  1548.310606  \n",
" 1            764.697595  1125.134021  "
]
},
"execution_count": 115,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cat_means"
]
},
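What `groupby('Labels').mean()` computes can be spelled out in plain Python. The sketch below uses only the five rows displayed by `df.head()` (Milk and Grocery columns), so its numbers are illustrative and do not match the full-dataset means in `cat_means`; the helper name `mean_by_label` is an assumption made for this example.

```python
from collections import defaultdict

# (Labels, Milk, Grocery) for the five rows displayed by df.head().
head_rows = [
    (0, 9656, 7561),
    (0, 9810, 9568),
    (0, 8808, 7684),
    (1, 1196, 4221),
    (0, 5410, 7198),
]


def mean_by_label(rows):
    """Average every numeric column separately for each label."""
    grouped = defaultdict(list)
    for label, *values in rows:
        grouped[label].append(values)
    # zip(*members) turns the list of rows into a list of columns.
    return {
        label: [sum(col) / len(col) for col in zip(*members)]
        for label, members in grouped.items()
    }


head_means = mean_by_label(head_rows)
print(head_means)
```

Each label ends up with one mean per column, which is the same shape as one row of `cat_means`.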
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Normalize the dataframe from the previous task using MinMaxScaler so the spending means go from 0-1 and create a heatmap of the values.**"
]
},
{
"cell_type": "code",
"execution_count": 119,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "code",
"execution_count": 120,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.preprocessing import MinMaxScaler"
]
},
{
"cell_type": "code",
"execution_count": 121,
"metadata": {},
"outputs": [],
"source": [
"scaler = MinMaxScaler()\n",
"data = scaler.fit_transform(cat_means)\n",
"scaled_means = pd.DataFrame(data,cat_means.index,cat_means.columns)"
]
},
{
"cell_type": "code",
"execution_count": 122,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
"    .dataframe tbody tr th:only-of-type {\n",
"        vertical-align: middle;\n",
"    }\n",
"\n",
"    .dataframe tbody tr th {\n",
"        vertical-align: top;\n",
"    }\n",
"\n",
"    .dataframe thead th {\n",
"        text-align: right;\n",
"    }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>Fresh</th>\n",
"      <th>Milk</th>\n",
"      <th>Grocery</th>\n",
"      <th>Frozen</th>\n",
"      <th>Detergents_Paper</th>\n",
"      <th>Delicassen</th>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Labels</th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>-1</th>\n",
"      <td>1.000000</td>\n",
"      <td>1.000000</td>\n",
"      <td>1.000000</td>\n",
"      <td>1.000000</td>\n",
"      <td>1.000000</td>\n",
"      <td>1.000000</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>0</th>\n",
"      <td>0.000000</td>\n",
"      <td>0.239292</td>\n",
"      <td>0.341011</td>\n",
"      <td>0.000000</td>\n",
"      <td>0.380758</td>\n",
"      <td>0.059938</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1</th>\n",
"      <td>0.203188</td>\n",
"      <td>0.000000</td>\n",
"      <td>0.000000</td>\n",
"      <td>0.156793</td>\n",
"      <td>0.000000</td>\n",
"      <td>0.000000</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"           Fresh      Milk   Grocery    Frozen  Detergents_Paper  Delicassen\n",
"Labels                                                                      \n",
"-1      1.000000  1.000000  1.000000  1.000000          1.000000    1.000000\n",
" 0      0.000000  0.239292  0.341011  0.000000          0.380758    0.059938\n",
" 1      0.203188  0.000000  0.000000  0.156793          0.000000    0.000000"
]
},
"execution_count": 122,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"scaled_means"
]
},
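`MinMaxScaler` rescales each column independently with (x - min) / (max - min). As a sanity check, the following plain-Python sketch applies that formula to the Milk and Detergents_Paper columns of `cat_means`; the names `cat_means_sample`, `min_max_scale` and `scaled_sample` are assumptions made for illustration, and the input values are the rounded ones printed in the notebook.

```python
# Column means copied from the cat_means table, ordered by label -1, 0, 1.
cat_means_sample = {
    "Milk": [26872.411765, 8849.446970, 3180.065292],
    "Detergents_Paper": [14612.294118, 6037.280303, 764.697595],
}


def min_max_scale(column):
    """Rescale one column to the 0-1 range: (x - min) / (max - min)."""
    low, high = min(column), max(column)
    return [(x - low) / (high - low) for x in column]


scaled_sample = {name: min_max_scale(col) for name, col in cat_means_sample.items()}
for name, col in scaled_sample.items():
    print(name, [round(v, 6) for v in col])
```

The values for label 0 should agree with the `scaled_means` table to the printed precision (about 0.239 for Milk and 0.381 for Detergents_Paper). Because the outlier row holds the maximum of every column, it maps to 1.0 everywhere, which is why the first heatmap is dominated by the -1 row.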
{
"cell_type": "code",
"execution_count": 123,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:ylabel='Labels'>"
]
},
"execution_count": 123,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 432x288 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.heatmap(scaled_means)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: Create another heatmap similar to the one above, but with the outliers removed**"
]
},
{
"cell_type": "code",
"execution_count": 125,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:ylabel='Labels'>"
]
},
"execution_count": 125,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 432x288 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.heatmap(scaled_means.loc[[0,1]],annot=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**TASK: What spending category were the two clusters most different in?**"
]
},
{
"cell_type": "code",
"execution_count": 126,
"metadata": {},
"outputs": [],
"source": [
"#CODE HERE"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Detergents_Paper shows the largest gap between the two clusters: its normalized mean is about 0.38 for cluster 0 and 0 for cluster 1."
]
},
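The same conclusion can be checked numerically instead of reading it off the heatmap. The sketch below copies the label 0 and label 1 rows of `scaled_means` and picks the category with the largest absolute gap; the names `cluster_0`, `cluster_1`, `gaps` and `most_different` are assumptions made for illustration.

```python
# Rows for cluster 0 and cluster 1, copied from the scaled_means table.
cluster_0 = {"Fresh": 0.000000, "Milk": 0.239292, "Grocery": 0.341011,
             "Frozen": 0.000000, "Detergents_Paper": 0.380758, "Delicassen": 0.059938}
cluster_1 = {"Fresh": 0.203188, "Milk": 0.000000, "Grocery": 0.000000,
             "Frozen": 0.156793, "Detergents_Paper": 0.000000, "Delicassen": 0.000000}

# Absolute difference between the two clusters for every spending category.
gaps = {cat: abs(cluster_0[cat] - cluster_1[cat]) for cat in cluster_0}

# The category whose scaled means are furthest apart.
most_different = max(gaps, key=gaps.get)
print(most_different, round(gaps[most_different], 6))
```

With pandas, an expression along the lines of `(scaled_means.loc[0] - scaled_means.loc[1]).abs().idxmax()` should give the same category. Grocery is a close second here, so the ranking depends on the scaling, which in this notebook includes the outlier row.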
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 1
}