{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"___\n",
"\n",
"<a href='http://www.pieriandata.com'><img src='../Pierian_Data_Logo.png'/></a>\n",
"___\n",
"<center><em>Copyright by Pierian Data Inc.</em></center>\n",
"<center><em>For more information, visit us at <a href='http://www.pieriandata.com'>www.pieriandata.com</a></em></center>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# DBSCAN Hyperparameters\n",
"\n",
"\n",
"Let's explore the hyperparameters for DBSCAN and how they can change results!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## DBSCAN and Clustering Examples"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"two_blobs = pd.read_csv('../DATA/cluster_two_blobs.csv')\n",
"two_blobs_outliers = pd.read_csv('../DATA/cluster_two_blobs_outliers.csv')"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='X1', ylabel='X2'>"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.scatterplot(data=two_blobs,x='X1',y='X2')"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='X1', ylabel='X2'>"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.scatterplot(data=two_blobs_outliers,x='X1',y='X2')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Label Discovery"
]
},
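{
"cell_type": "code",
"execution_count": 75,
"metadata": {},
"outputs": [],
"source": [
"def display_categories(model,data):\n",
"    # Fit the model and get one cluster label per row (DBSCAN labels noise as -1)\n",
"    labels = model.fit_predict(data)\n",
"    # Color each point by its assigned label\n",
"    sns.scatterplot(data=data,x='X1',y='X2',hue=labels,palette='Set1')"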
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## DBSCAN"
]
},
{
"cell_type": "code",
"execution_count": 76,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.cluster import DBSCAN"
]
},
{
"cell_type": "code",
"execution_count": 77,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Help on class DBSCAN in module sklearn.cluster._dbscan:\n",
"\n",
"class DBSCAN(sklearn.base.ClusterMixin, sklearn.base.BaseEstimator)\n",
" | DBSCAN(eps=0.5, *, min_samples=5, metric='euclidean', metric_params=None, algorithm='auto', leaf_size=30, p=None, n_jobs=None)\n",
" | \n",
" | Perform DBSCAN clustering from vector array or distance matrix.\n",
" | \n",
" | DBSCAN - Density-Based Spatial Clustering of Applications with Noise.\n",
" | Finds core samples of high density and expands clusters from them.\n",
" | Good for data which contains clusters of similar density.\n",
" | \n",
" | Read more in the :ref:`User Guide <dbscan>`.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | eps : float, default=0.5\n",
" | The maximum distance between two samples for one to be considered\n",
" | as in the neighborhood of the other. This is not a maximum bound\n",
" | on the distances of points within a cluster. This is the most\n",
" | important DBSCAN parameter to choose appropriately for your data set\n",
" | and distance function.\n",
" | \n",
" | min_samples : int, default=5\n",
" | The number of samples (or total weight) in a neighborhood for a point\n",
" | to be considered as a core point. This includes the point itself.\n",
" | \n",
" | metric : string, or callable, default='euclidean'\n",
" | The metric to use when calculating distance between instances in a\n",
" | feature array. If metric is a string or callable, it must be one of\n",
" | the options allowed by :func:`sklearn.metrics.pairwise_distances` for\n",
" | its metric parameter.\n",
" | If metric is \"precomputed\", X is assumed to be a distance matrix and\n",
" | must be square. X may be a :term:`Glossary <sparse graph>`, in which\n",
" | case only \"nonzero\" elements may be considered neighbors for DBSCAN.\n",
" | \n",
" | .. versionadded:: 0.17\n",
" | metric *precomputed* to accept precomputed sparse matrix.\n",
" | \n",
" | metric_params : dict, default=None\n",
" | Additional keyword arguments for the metric function.\n",
" | \n",
" | .. versionadded:: 0.19\n",
" | \n",
" | algorithm : {'auto', 'ball_tree', 'kd_tree', 'brute'}, default='auto'\n",
" | The algorithm to be used by the NearestNeighbors module\n",
" | to compute pointwise distances and find nearest neighbors.\n",
" | See NearestNeighbors module documentation for details.\n",
" | \n",
" | leaf_size : int, default=30\n",
" | Leaf size passed to BallTree or cKDTree. This can affect the speed\n",
" | of the construction and query, as well as the memory required\n",
" | to store the tree. The optimal value depends\n",
" | on the nature of the problem.\n",
" | \n",
" | p : float, default=None\n",
" | The power of the Minkowski metric to be used to calculate distance\n",
" | between points.\n",
" | \n",
" | n_jobs : int, default=None\n",
" | The number of parallel jobs to run.\n",
" | ``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.\n",
" | ``-1`` means using all processors. See :term:`Glossary <n_jobs>`\n",
" | for more details.\n",
" | \n",
" | Attributes\n",
" | ----------\n",
" | core_sample_indices_ : ndarray of shape (n_core_samples,)\n",
" | Indices of core samples.\n",
" | \n",
" | components_ : ndarray of shape (n_core_samples, n_features)\n",
" | Copy of each core sample found by training.\n",
" | \n",
" | labels_ : ndarray of shape (n_samples)\n",
" | Cluster labels for each point in the dataset given to fit().\n",
" | Noisy samples are given the label -1.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> from sklearn.cluster import DBSCAN\n",
" | >>> import numpy as np\n",
" | >>> X = np.array([[1, 2], [2, 2], [2, 3],\n",
" | ... [8, 7], [8, 8], [25, 80]])\n",
" | >>> clustering = DBSCAN(eps=3, min_samples=2).fit(X)\n",
" | >>> clustering.labels_\n",
" | array([ 0, 0, 0, 1, 1, -1])\n",
" | >>> clustering\n",
" | DBSCAN(eps=3, min_samples=2)\n",
" | \n",
" | See also\n",
" | --------\n",
" | OPTICS\n",
" | A similar clustering at multiple values of eps. Our implementation\n",
" | is optimized for memory usage.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | For an example, see :ref:`examples/cluster/plot_dbscan.py\n",
" | <sphx_glr_auto_examples_cluster_plot_dbscan.py>`.\n",
" | \n",
" | This implementation bulk-computes all neighborhood queries, which increases\n",
" | the memory complexity to O(n.d) where d is the average number of neighbors,\n",
" | while original DBSCAN had memory complexity O(n). It may attract a higher\n",
" | memory complexity when querying these nearest neighborhoods, depending\n",
" | on the ``algorithm``.\n",
" | \n",
" | One way to avoid the query complexity is to pre-compute sparse\n",
" | neighborhoods in chunks using\n",
" | :func:`NearestNeighbors.radius_neighbors_graph\n",
" | <sklearn.neighbors.NearestNeighbors.radius_neighbors_graph>` with\n",
" | ``mode='distance'``, then using ``metric='precomputed'`` here.\n",
" | \n",
" | Another way to reduce memory and computation time is to remove\n",
" | (near-)duplicate points and use ``sample_weight`` instead.\n",
" | \n",
" | :class:`cluster.OPTICS` provides a similar clustering with lower memory\n",
" | usage.\n",
" | \n",
" | References\n",
" | ----------\n",
" | Ester, M., H. P. Kriegel, J. Sander, and X. Xu, \"A Density-Based\n",
" | Algorithm for Discovering Clusters in Large Spatial Databases with Noise\".\n",
" | In: Proceedings of the 2nd International Conference on Knowledge Discovery\n",
" | and Data Mining, Portland, OR, AAAI Press, pp. 226-231. 1996\n",
" | \n",
" | Schubert, E., Sander, J., Ester, M., Kriegel, H. P., & Xu, X. (2017).\n",
" | DBSCAN revisited, revisited: why and how you should (still) use DBSCAN.\n",
" | ACM Transactions on Database Systems (TODS), 42(3), 19.\n",
" | \n",
" | Method resolution order:\n",
" | DBSCAN\n",
" | sklearn.base.ClusterMixin\n",
" | sklearn.base.BaseEstimator\n",
" | builtins.object\n",
" | \n",
" | Methods defined here:\n",
" | \n",
" | __init__(self, eps=0.5, *, min_samples=5, metric='euclidean', metric_params=None, algorithm='auto', leaf_size=30, p=None, n_jobs=None)\n",
" | Initialize self. See help(type(self)) for accurate signature.\n",
" | \n",
" | fit(self, X, y=None, sample_weight=None)\n",
" | Perform DBSCAN clustering from features, or distance matrix.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | X : {array-like, sparse matrix} of shape (n_samples, n_features), or (n_samples, n_samples)\n",
" | Training instances to cluster, or distances between instances if\n",
" | ``metric='precomputed'``. If a sparse matrix is provided, it will\n",
" | be converted into a sparse ``csr_matrix``.\n",
" | \n",
" | sample_weight : array-like of shape (n_samples,), default=None\n",
" | Weight of each sample, such that a sample with a weight of at least\n",
" | ``min_samples`` is by itself a core sample; a sample with a\n",
" | negative weight may inhibit its eps-neighbor from being core.\n",
" | Note that weights are absolute, and default to 1.\n",
" | \n",
" | y : Ignored\n",
" | Not used, present here for API consistency by convention.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | self\n",
" | \n",
" | fit_predict(self, X, y=None, sample_weight=None)\n",
" | Perform DBSCAN clustering from features or distance matrix,\n",
" | and return cluster labels.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | X : {array-like, sparse matrix} of shape (n_samples, n_features), or (n_samples, n_samples)\n",
" | Training instances to cluster, or distances between instances if\n",
" | ``metric='precomputed'``. If a sparse matrix is provided, it will\n",
" | be converted into a sparse ``csr_matrix``.\n",
" | \n",
" | sample_weight : array-like of shape (n_samples,), default=None\n",
" | Weight of each sample, such that a sample with a weight of at least\n",
" | ``min_samples`` is by itself a core sample; a sample with a\n",
" | negative weight may inhibit its eps-neighbor from being core.\n",
" | Note that weights are absolute, and default to 1.\n",
" | \n",
" | y : Ignored\n",
" | Not used, present here for API consistency by convention.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | labels : ndarray of shape (n_samples,)\n",
" | Cluster labels. Noisy samples are given the label -1.\n",
" | \n",
" | ----------------------------------------------------------------------\n",
" | Data descriptors inherited from sklearn.base.ClusterMixin:\n",
" | \n",
" | __dict__\n",
" | dictionary for instance variables (if defined)\n",
" | \n",
" | __weakref__\n",
" | list of weak references to the object (if defined)\n",
" | \n",
" | ----------------------------------------------------------------------\n",
" | Methods inherited from sklearn.base.BaseEstimator:\n",
" | \n",
" | __getstate__(self)\n",
" | \n",
" | __repr__(self, N_CHAR_MAX=700)\n",
" | Return repr(self).\n",
" | \n",
" | __setstate__(self, state)\n",
" | \n",
" | get_params(self, deep=True)\n",
" | Get parameters for this estimator.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | deep : bool, default=True\n",
" | If True, will return the parameters for this estimator and\n",
" | contained subobjects that are estimators.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | params : mapping of string to any\n",
" | Parameter names mapped to their values.\n",
" | \n",
" | set_params(self, **params)\n",
" | Set the parameters of this estimator.\n",
" | \n",
" | The method works on simple estimators as well as on nested objects\n",
" | (such as pipelines). The latter have parameters of the form\n",
" | ``<component>__<parameter>`` so that it's possible to update each\n",
" | component of a nested object.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | **params : dict\n",
" | Estimator parameters.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | self : object\n",
" | Estimator instance.\n",
"\n"
]
}
],
"source": [
"help(DBSCAN)"
]
},
{
"cell_type": "code",
"execution_count": 78,
"metadata": {},
"outputs": [],
"source": [
"dbscan = DBSCAN()"
]
},
{
"cell_type": "code",
"execution_count": 79,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"display_categories(dbscan,two_blobs)"
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"display_categories(dbscan,two_blobs_outliers)"
]
},
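{
"cell_type": "markdown",
"metadata": {},
"source": [
"The plots above show cluster membership only through color. As a rough sketch (an addition for illustration, not part of the original walkthrough), the labels returned by `fit_predict` can also be tallied directly. Per the DBSCAN docstring printed earlier, noisy samples are given the label `-1`. The helper name `summarize_labels` is hypothetical."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def summarize_labels(labels):\n",
"    # Hypothetical helper: count clusters and noise points in a label array\n",
"    labels = np.asarray(labels)\n",
"    n_noise = int(np.sum(labels == -1))            # DBSCAN labels noise as -1\n",
"    n_clusters = len(set(labels.tolist()) - {-1})  # distinct labels, excluding noise\n",
"    return n_clusters, n_noise\n",
"\n",
"# Reuses the default-parameter model and the outlier dataset defined above\n",
"labels = dbscan.fit_predict(two_blobs_outliers)\n",
"n_clusters, n_noise = summarize_labels(labels)\n",
"print(f'clusters: {n_clusters}, noise points: {n_noise}')"
]
},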
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Epsilon\n",
"\n",
"    eps : float, default=0.5\n",
"        The maximum distance between two samples for one to be considered\n",
"        as in the neighborhood of the other. This is not a maximum bound\n",
"        on the distances of points within a cluster. This is the most\n",
"        important DBSCAN parameter to choose appropriately for your data set\n",
"        and distance function."
]
},
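{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to get a feel for how sensitive the result is to `eps` is to sweep a handful of values and count the clusters and noise points for each. This is a minimal sketch added for illustration: the grid of values below is an arbitrary assumption, not a recommendation, and sensible values depend on the scale of your data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.cluster import DBSCAN\n",
"\n",
"# Assumed, arbitrary grid of eps values; adjust to the scale of your data\n",
"for eps in [0.01, 0.1, 0.5, 1, 2, 10]:\n",
"    labels = DBSCAN(eps=eps).fit_predict(two_blobs_outliers)\n",
"    n_clusters = len(set(labels.tolist()) - {-1})  # distinct labels, excluding noise (-1)\n",
"    n_noise = int(np.sum(labels == -1))\n",
"    print(f'eps={eps}: {n_clusters} clusters, {n_noise} noise points')"
]
},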
{
"cell_type": "code",
"execution_count": 81,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAABSzElEQVR4nO3dd5gT1RrA4d+ZSd0kW1l6bwKK9CYiIliwYUHsXRFFRVQU7L1LURH12nsFCyoWVOwCIr1I70jZlmTTc+4fWSLLAoKyhCXf+zw8l5zMTL6Ml/NlTlVaa4QQQqQfI9UBCCGESA1JAEIIkaYkAQghRJqSBCCEEGlKEoAQQqQpS6oD2BPVqlXTDRs2THUYQghRpfz++++btdb525dXqQTQsGFDpk+fnuowhBCiSlFKrdxRuTQBCSFEmpIEIIQQaUoSgBBCpKkq1QewI5FIhDVr1hAMBlMdym5zOBzUrVsXq9Wa6lCEEGmsyieANWvW4PF4aNiwIUqpVIfzj7TWbNmyhTVr1tCoUaNUhyOESGNVPgEEg8EqU/kDKKXIy8tj06ZNqQ5FCLELsc2bicyeTWzjJiwN6mNt3RrD7U51WHtVlU8AQJWp/LeqavEKcSCLbd6MLi3FrFEDZbcnyoqLKb7nXgIfjE8el3XP3bguvghlHDhdpwfONxFCiN0U27CB4M+/EJg8maI77uSv7j0ovGEYkWXLAYgu+rNc5Q9Q/MCDRFesSEG0lUcSwF60cOFCunXrht1u57HHHkt1OEKIHYgsWMCmU05jyxkDKLjgIgiHcZ50EoEJEygZORIdChH3llQ8MRhE+0v3ebyVSRLAXpSbm8sTTzzBjTfemOpQhBA7oEMhSp54ktjq1cmy4OeTsDRvBkoR/ulnohs3YmnWDLVde7/10NZY6tXd1yFXqrRLAP7xE9jQuStr69ZnQ+eu+MdP2GvXrl69Op06dZLhnULsp2LFxYR/+rlCeXzLFlyXXYrztFMpvPRyfE+PI++lF7F27gSmif3oo8kZNQojO3vfB12JDohO4N3lHz+B4ptuRgcCAMTWrqX4ppsBcJ12aipDE0LsA2Z2NvYjjiAwofwPP0uLgwhN+Z7gp58BEJk3j+Bnn1Ptw/HoUAhdWorK9KQi5EqVVgnA+9DDycp/Kx0I4H3oYUkAQhzAouvXE5k5i+i6tThPPgksJoH33gelyLjoIsz6DQh+Pqn8STYrkVmzKbr1NnRxMUa1auSMfQrH4d0rXD+yYiWhX34mumAh9u7dsXXuhJmTs4++3b+XVk1AsXXr9qh8d4wdO5a2bdvStm1b1v2H6wgh9pzWmvDs2fheeRX/2+8QWfRnhWNiBQUUDb+Fgssup+SOuyi4+BKszZpR7YtJ5L36CpZ6dYmvW4f7ykFgmsnzXGeeSeGNw9DFxQDEN2+mcNCVhGbOJL7ND8nQjD8offddYkuWYubnUzR8BP7X30DH45V/A/6jtHoCMGvXJrZ27Q7L/63BgwczePDg/xKWEOIfhOfNI/DRx0SXLMF56qnYD++OmZNDePp0Ng84C8JhAFR2NtXeexdbq5bJc6OLFhH6+uty1ysZOYq8Qw5hy/kXJMvMxo1wXXIx/pdfwdn/dCxNm2Fr3w579+6J61utlL7/AaFvviX0zbc4B5yBLihky1lno/3+xEXsdrKG30zxw4+QcdJJWBo2qPyb8x+kVQLwDL+5XB8AgHI68Qy/ea9cf8OGDXTs2JGSkhIMw2D06NHMnz+fzMzMvXJ9IdJRZPFiNg84C11UBEDwiy/JuvduXBdcgO/Z55KVP4AuKiL0zTflEkB8a+W8rWCQ8Ly55Ypiy5ZjG94Ra7NmBL/5hsiiBbjOOYfo2rUQiYCCzBE3E122HO/Dj2A2bkR03vy/K3+AUIjQ1KlYmjRBx6LJ4ujq1URXrsTIysLStCmG07l3bs5/lFYJYGs7v/ehh4mtW4dZuzae4Tfvtfb/mjVrsmbNmr1yLSFEQm
Te/GTlv5V35Gjsxx5LbG3FZtfYhg3lXluaNEFlZqJL/h7bb+vYAbN2nXLHmfXqEVn0J77HRyYKJn2BUbMGGWecge/JpxLXatoUz9AhAIS+/wF2sAhlvKgIZ7+TsNRNDBkNTf+dgosuIl6Y+A7uq67EcewxEI1hadYUMy9v92/GXpZWfQCQSAI1p/5KnTWrqDn1V+n8FWK/pyuWaI1hmrguvqjCe45jji732tqoETmjR2Lr1AmVlYXj+L7Ye/Ui/MMPWJo2TR7nPPEE/OOeKXdufMNfqG2GdUeXLCG+pQAMA0IhrO3aVfh8Z7+TyejXD2W3Ey8qomjELcnKH8D39DhCP/3M5tP7s+Wii4ksX7Gb92HvS6snACHE/i/u9RKZP5/Y+vWYdepgOeigCr/gPdcNwaxZE0fvo8h+/DHC038nXurH2bcvto4diRcWEvtrI0ZWFmatmgQ++xzlcpFxyimEZ8wg+NnnWA85hIwrB+Ef+zSGx4Otcyd8L738z/EVFGDk5GBt3Zrg55Pw3DSMwAfj0bEYnquvwnH88ZhZWQDEioqIzp9f8SKhEACRGX8Q/PILrFdcUeGQ2KZNRObNJ15UiKVxE6wtW5RLRnvDAZEAtNZVaoE1rSv+ohFCQDwYxPfc//COHJUsy7zvHqq98xalH4wnungxGQMGYD+iBwA6HCZeUkJ4+nQsTRpj1qxJdMUKCq+7nui8eRj5+WQ//ii2tm0ovu0OQtt8lv2YozEMA+dJJ2I96CCCP/9KxoAzKH31teQxRk42bDeax9KsGdmPPULxvfcTW7aM6MoVeK66ClWtGo6jemF6/p4vYOTlYW3Xjsgff5T/omWLzgGEfvwJz3YJILZpE4U3Dyf0xZdlFzLIff5/OI895l/c1Z2r8gnA4XCwZcsW8vLyqkQS2LofgMPhSHUoQux3okuW4B01ulxZyV33UP2LSWTffVe5ch2L4X/xJXxPj0ucu3gxoSnf47lpGNF58wCIb9qE9/GRZN59NxnnnUvp2+9ALIbzpBNRSlE09Prk9XKef454YSGeG64nNOV7LC0OwnnSiQS++BKsVozcXNyXXoLvmWdxnnoKjiN7Yl5wHqDwPf8CGAbxQABbi4MwPJlY6tfD9HjIHHYjRTcOSww3t1hwXXQhkXnzMGrVJL5+A44+vSvch8j8+X9X/gDxOEW33IqtXVvM6tX3yr2GFCYApVQ94FWgBolGvue01mP29Dp169ZlzZo1VWp9/a07ggkhyosXF8P2T8jRKPGS4grHxtavx/fCi+XKdCCQHLcPYNSqiaN3b/xjx4LdgWfwVaAUllat8D3zDJ6bhkEohNm0KaVvv0Po68kotxtrm0OJrVlD3OvD0asX9i5dIBYjMm8eOholtmoV/ldeJeu+e/CNHYfrwgswcnMJ//EHJTcPB8B14QVk9O+Pd+zTOI7ug1m7NtZDDyVeWEjo229x9OqFrUMHjCaNicxfgKVpE5TNlrgP2/QZJO/Nhg3E/X7MCu/8e6l8AogCN2itZyilPMDvSqmvtNY7aDDbOavVKjtrCXGAsNSrh5GTXa4CNGrUwFKvfoVjlcWCkeEkHgqVf2ObdvKMfv3wPfc/0Br35ZfhHZ34jZk9ZhS2du3wPvIo1tat8XTpgu2QQ7AdfDCBL79MrheUcfbZFF03lPiWLUBiroDn+qEUXTMEs0EDwtN+xz3wckpGjcZ14QWUvv5G8rP9L7yIkZlJZOZMwj/9BEqRddcdFN99b7JZqfT9D8i8aRgFDz5E9kMPknFGf+IlJYnF6Lbr97D36Y1Zo8Z/u8HbSdkoIK31eq31jLK/e4EFQJ1dnyWEOJBZ6tcn96WXsDRrlnh98MHkvvA/zFo1Kxxr1qyJZ9iw8mV16mBp2gTKmoOV3Y7yuHGddy5GnTp4ht+MpUkTzNw8/C++BIBnyLX4nnkW7+gxeJ8eh61DBxzHHguADgbLzQ6OLVtOdPYcVFYWuqQEyyEHE/zxRyxNGlds5wdCU6
dibdkSS6uWZN17D/FAEM/1Q8k45+zEAeEw0cVLMGvWJDh5Mr7XXmdT3xMoHDSIrDtux17W5m/v05us227DyMj4j3e
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# Tiny epsilon --> tiny neighborhood radius --> every point is labeled an outlier (label -1)\n",
"dbscan = DBSCAN(eps=0.001)\n",
"display_categories(dbscan,two_blobs_outliers)"
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAABTDElEQVR4nO3dd5gTVffA8e+dSd0kW1l6b1JEelFERLCABQtiFytiRVQUrD87NhAVUV97r2BBRQUVC0oR6YggvUnZlmTTc39/ZAksCwjCEjDn8zw8L7mZmZyML3Myd+49V2mtEUIIkX6MVAcghBAiNSQBCCFEmpIEIIQQaUoSgBBCpClJAEIIkaYsqQ5gb1SpUkXXr18/1WEIIcQh5bffftustc7fsf2QSgD169dn5syZqQ5DCCEOKUqplTtrly4gIYRIU5IAhBAiTUkCEEKINHVIPQPYmUgkwpo1awgGg6kOZZccDge1a9fGarWmOhQhhEg65BPAmjVr8Hg81K9fH6VUqsOpQGvNli1bWLNmDQ0aNEh1OEIIkXTIJ4BgMHjQXvwBlFLk5eWxadOmVIcihNgLsc2bicydS2zjJiz16mJt1QrD7U51WPvVIZ8AgIP24r/VwR6fEOkstnkzurQUs1o1lN2eaCsupvi++wl8NC65XdZ99+K69BKU8d95dPrf+SZCCLGHYhs2EJz6C4HJkym6+x7+7tqNwpuHElm2HIDo4j/LXfwBih96mOiKFSmItvJIAthPJk6cyGGHHUbjxo0ZMWJEqsMRQuxCZNEiNp1+JlvO7k/BxZdAOIzz1FMJjB9PyciR6FCIuLek4o7BINpfesDjrUySAPaDWCzGtddey5dffsnChQt55513WLhwYarDEkLsQIdClDz1NLHVq5NtwS8nYmnaBJQi/PNUohs3YmnSBLVDf7/1iFZY6tQ+0CFXqrRLAP5x49nQqQtra9dlQ6cu+MeN3+djTp8+ncaNG9OwYUNsNhvnnnsun3zyyX6IVgixP8WKiwn/PLVCe3zLFlxXXI7zzDMovPxKfM+OJe+Vl7F26gimif3448kZNQojO/vAB12J/hMPgfeUf9x4im+9DR0IABBbu5biW28DwHXmGf/6uGvXrqVOnTrJ17Vr12batGn7FqwQYr8zs7OxH3MMgfHlf/hZmh1GaMoPBD//AoDIggUEv/iSKh+PQ4dC6NJSVKYnFSFXqrRKAN4RjyQv/lvpQADviEf2KQEIIQ5u0fXricyeQ3TdWpynnQoWk8AHH4JSZFxyCWbdegS/nFh+J5uVyJy5FN1xJ7q4GKNKFXLGPIPj6K4Vjh9ZsZLQL1OJLvoDe9eu2Dp1xMzJOUDf7t9LqwQQW7dur9r3VK1atVi9XZ/imjVrqFWr1j4dUwjxz7TWRObNI/z7bJTdjq1tW6yHNS23TayggKJhtxOaNCnZlnn7cFxfTURv3EhkyRLi69bhvnoQvueeh1gMANc551B4y1AoqzIQ37yZwkFXk/vm61gPOwzD6QQgNOt3gpMmQSiEmZ9P0bDhuC67FM+11xz0Q0bTKgGYNWsSW7t2p+37omPHjixZsoTly5dTq1Yt3n33Xd5+++19OqYQYpvwggUEPvmU6NKlOM84A/vRXTFzcgjPnMnm/udCOAyAys6mygfvY2vRPLlvdPHichd/gJKRo8g7/HC2XHRxss1s2ADXZZfif/U1nP3OwtK4CbZ2bbF37Zo4vtVK6YcfEfr2O0Lffoez/9nogkK2nHse2u9PHMRuJ2vYbRQ/8igZp56KpX69yj85++DgTk/7mWfYbaiyrL2VcjrxDLttn45rsVh45plnOPHEE2nevDn9+/enZcuW+3RMIURCZMkSNvc/F9+YZwl+9TWFg64mMH48OhrF9/wLyYs/gC4qIvTtt+X2j2+9OG8vGCS8YH65ptiy5dg6diD7wQfQhYVEFi/Cdf75YLGAUqAgc/htYLHgfWIk4ZkzCXz22baLP0AoRG
j6dCyNGqFj0WRzdPVqgj/9RHjePOI7dEOnUlrdAWzt5/eOeITYunWYNWviGXbbfun/79OnD3369Nnn4wghyossWIguKirX5h35JPYTTyS2tmL3bWzDhnKvLY0aoTIz0SXbxvbbOrTHrFm+m9asU4fI4j/xPTEy0TDxK4zq1cg4+2x8Tz+TOFbjxniGDAYg9MOPye6h7cWLinD2PRVL7cSQ0dDM3yi45BLihYnv4L7mahwnngDRGJYmjTHz8vb8ZOxnaXUHAIkkUH36r9Ras4rq03+Vh79CHPR0xRatMUwT16WXVHjPccLx5V5bGzQg58mR2Dp2RGVl4ejTG3uPHoR//BFL48bJ7ZynnIx/7HPl9o1v+Bu1XRXf6NKlxLcUgGFAKIS1bdsKn+/sexoZffui7HbiRUUUDb89efEH8D07ltDPU9l8Vj+2XHIpkeUr9vA87H9pdQcghDj4xb1eIgsXElu/HrNWLSyHHVbhF7znxsGY1avj6Hkc2U88Tnjmb8RL/Th798bWoQPxwkJif2/EyMrCrFGdwBdfolwuMk4/nfCsWQS/+BLr4YeTcfUg/GOexfB4sHXqiO+VV/85voICjJwcrK1aEfxyIp5bhxL4aBw6FsNz3TU4+vTBzMoCIFZURHRnk0JDIQAis34n+PVXWK+6qsImsU2biCxYSLyoEEvDRlibNyuXjPaH/0QC0Fof1AXXtK74C0YIUVE8GMT3wv/wjhyVbMt84D6qvPcOpR+NI7pkCRn9+2M/phsAOhwmXlJCeOZMLI0aYlavTnTFCgpvvInoggUY+flkP/EYtjatKb7zbkLbfZb9hOMxDAPnqadgPewwglN/JaP/2ZS+/kZyGyMnG+LxcjFamjQh+/FHKb7/QWLLlhFduQLPNdegqlTBcVwPTM+2+QJGXh7Wtm2J/P57+S9aVnQOIPTTz3h2SACxTZsovG0Yoa++LjuQQe6L/8N54gn/4qzu2iGfABwOB1u2bCEvL++gTAJb1wNwOBypDkWIg1506VK8o54s11byf/dR9auJZN/7f+XadSyG/+VX8D07NrHvkiWEpvyA59ahRBcsACC+aRPeJ0aSee+9ZFx4AaXvvgexGM5TT0EpRdGQm5LHy3nxBeKFhXhuvonQlB+wNDsM56mnEPjqa7BaMXJzcV9+Gb7nnsd5xuk4ju2OefGFgML34ktgGMQDAWzNDsPwZGKpWwfT4yFz6C0U3TI0MdzcYsF1yQAiCxZg1KhOfP0GHL16VjgPkYULt138AeJxim6/A1vbNphVq+6Xcw0pTABKqTrA60A1Ep18L2itR+/tcWrXrs2aNWsO6nr7W1cEE0LsXry4GHa8Y45GiZcUV9g2tn49vpdeLtemAwF08bZtjRrVcfTsiX/MGLA78Fx7DSiFpUULfM89h+fWoYnx+40bU/rue4QmTUa53VhbH0FszRriXh+OHj2wd+4MsRiRBQvQ0SixVavwv/Y6WQ/ch2/MWFwDLsbIzSX8+++U3DYMANeAi8no1w/vmGdxHN8Ls2ZNrEccQbywkNB33+Ho0QNb+/YYjRoSWbgIS+NGKJstcR62e2aQPDcbNhD3+zH38RxvL5V3AFHgZq31LKWUB/hNKfWN1nqvqqhZrVZZaUuI/whLnToYOdnlLoBGtWpY6tStsK2yWDAynMRDofJvbNdPntG3L74X/gda477yCrxPJn5jZo8eha1tW7yPPoa1VSs8nTtjO/xwbC1bEvj662S9oIzzzqPoxiHEt2wBEnMFPDcNoej6wZj16hGe8RvugVdSMupJXAMupvTNt5Kf7X/pZYzMTCKzZxP++WdQiqz/u5vie+9PdiuVfvgRmbcOpeDhEWSPeJiMs/sRLylJFKPb4bmHvVdPzGrV9u0E7yBlo4C01uu11rPK/u4FFgEyfVaINGapW5fcV17B0qRJ4nXLluS+9D/MGtUrbGtWr45n6NDybbVqYWncKDFuH1B2O8rjxnXhBRi1auEZdhuWRo0wc/
Pwv/wKAJ7BN+B77nm8T47G++xYbO3b4zjxRAB0MAjmtt/csWXLic6dh8rKQpeUYDm8JcGffsLSqGHFfn4gNH061ub
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# Huge epsilon --> huge neighborhood radius --> every point lands in the same cluster (label 0)\n",
"dbscan = DBSCAN(eps=10)\n",
"display_categories(dbscan,two_blobs_outliers)"
]
},
{
"cell_type": "code",
"execution_count": 166,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAABV0klEQVR4nO3dd5hU1f348feZXnZmtvdeYXfpS0epCtgQe+8ajekmMYn5Jj9TNTExxZhEI9GoscXeKyBFet9dtrO99zb9/v6YZWBdRBCWYZnzeh6eh7l7585nBvZ85p7yOUJRFCRJkqTgowp0AJIkSVJgyAQgSZIUpGQCkCRJClIyAUiSJAUpmQAkSZKClCbQARyPyMhIJTU1NdBhSJIkjSnbt29vUxQl6vPHx1QCSE1NZdu2bYEOQ5IkaUwRQlQf6bjsApIkSQpSMgFIkiQFKZkAJEmSgtSYGgM4EpfLRV1dHXa7PdChHDODwUBiYiJarTbQoUiSFMTGfAKoq6vDYrGQmpqKECLQ4XwpRVFob2+nrq6OtLS0QIcjSVIQG/MJwG63j5nGH0AIQUREBK2trYEORZKko+joc7C/oYf2PgcJYSbGxVsx6cd8kznMGfFuxkrjf9BYi1eSzmQdfQ7sLg+RFj06jRqAnkEXf3m/hPf2NPrP++7ycVw+IxmV6sz5/T0jEoAkSdLxaO2xU9Pej93l4b3djXxc2MQ5E+K4dUEGyRFmqlr6hjX+AI9+WMrsrEiSI8wBivrkk7OATqL9+/cze/Zs9Ho9Dz30UKDDkSTpCMqberlj1WbufnIb9zy7E5fHy+K8WN7f08i/VlfgcHvoc7hHPM/h9jLo8AQg4tEjE8BJFB4ezl/+8he+//3vBzoUSZKOwOn28OS6Cho7D80aXFPcQlp0CELA9qp2OnodpEaaMenVw547Ls5KXKjhVIc8qoIuAfS/8ipNM2ZRn5hM04xZ9L/y6km7dnR0NNOnT5fTOyXpNNU76GZ7VeeI4139Tq6YmczSiXHc+/wuntlQxUNXT2VicihqlWBeThT/tzIfq0kXgKhHT1CNAfS/8irdP7wXZXAQAE99Pd0/vBcA8yUrAxmaJEmngNWoZUZGBO9/rn8/PTqEzRXtrC5qBqC0qZc1xS3845YZuNweBp0eQgxn3he7oEoAvQ886G/8D1IGB+l94EGZACTpDNbaY6eovpum7kGW5MWgFvDO7kaEgMumJ5MQZmRtcfOw52jUgpKGbn7/djG9djfhZh2/uGwiBekRI65f197PjupOypt7mZ4WwcTkUGxj4G4hqBKAp6HhuI4fi7/97W88/vjjALzzzjvEx8d/5WtJknR8FEVhf2MPRXXd6DQq8hJDSY8OGXZOV7+TB94sYkPpobU3X1+SxVN3zqSjz0VVSx/N3XaunZvKfzdW4/EqAFwwJYFfv16Iw+0FoKPfyX0v7ebha6eSHh2CQedrPgvrulhf0orT4yUiRM8DbxZy+cxkbpiXftpPGQ2qBKCOj8dTX3/E41/V3Xffzd13330iYUmS9CVKG3v4cF8T1W19LJ0YT0FaODaTjr21Xdz95FZcHl+jbTVqefSm6WTGWvzPrWzpG9b4AzyxpoKcOCvffWaH/1hShInLZybzvy01nDcpnrRIM3mJoRSkh+Nye9GoVby7u4HPytvYWN7GBZMS6Bp08s2ntjHg9M0O0mlU3LU4i398XMaS/FgSw0/vKaNBlQAsP7p32BgAgDAasfzo3pNy/aamJgoKCujp6UGlUvGnP/2JoqIirFbrSbm+JAWjA619fOOpbfQMugD4dH8r31s+jkumJ/HsxgP+xh98C7g2lrUOSwADziNP6Sxt7Bl2rLZ9gIlJNtKixrOxtI3ylj4unpZAU5cdtxAI4K4lWdS29fOPT8pJjjBR2tjrb/wBnG4vu2s6SYk04/EeunZD5wD1nYNYDBpSI83+u4dAOz2iOEUO9vP3PvAgnoYG1PHxWH5070nr/4
+NjaWuru6kXEuSJJ/Spl5/43/QE2srOGtcNM3dI4tAtvY4hj1OiTQTYtDQZz+UCCYkhRJjMw47Ly7USGVLP/9aU+E7sB8iLXrOnxzPU+uqAEiNNHPL/AwAtlS0Y3d5+byeQRdL8mL9U0b31nZyz7M7/e/hurmpnD0uGo9XITXKTJhZfzwfx0kVdNNAzZesJHbLJhLqaojdskkO/krSGKQooFYJLpuRPOJnZ40bvvNhUoSZn12cz6TkUCwGDQvHxzA7K5ItlW2kRh7qolmUF8OzGw8Me25brwOt+lAzeaCtn64BJyoBTpeX/ETbiNdfkh/LORNj0WnU9Aw4efDN4mEJ7JkNB9he1cFd/97KD/67k9r2/q/6MZywoLoDkCTp9Ndnd1He3EtLt53YUCPpUSEjvsHfPD+daKuBudmR3HdRHvvquhhwulmQG8uEpFC6B5y09zoIMWqJthpYXdyMUafh3Ilx7KvtZnVxMzlxFq6dm8LT6w9g1muYlBzG/zbXfGl8XQNObEYtOfFW1hS38LVFmby3pxGPV+GGeWkszI3BYvRNGe0ZdFPe3DviGs6hgeV9dd2sK2nhmjkjKwN39DmG7n6cJEeYyYyxoFGf3O/sMgFIknTacLg8PPdZNU8c7IYB7lmew19vKOC9PY1UtfZxwZQEZgxNxXR5FPocbnbXdpESYSbKoqeuY4BfvLqPsqZewkN03Lcij9wEG394Z/+w15qXE41AsCgvhoxoCzur2jl/SgKvbK31n2M1avEoyrDnpUaG8JMV+fz1gxJq2geo6xjg+rmpRITomZUZSYjx0HqBULOWvEQrhXXDxxt0mkMN+dbK9hEJoKPPwQNvFvLpft/gtUrAA1dO5uzxMV/lY/1CMgFIknTaqG7rZ9XaimHH/vR+KU/fNZvvLh837LjHq/Diphqe2eDrnz/Q2s/minbuWJRJWZPvW3dHn5PHV5fzveXjuHhaIm/urMerKCzOjUUAv3q90H+93145iZ5BN7ctyGBLRTvp0SEszotl3f4WNGpBqEnHFTOTeXbjAZZOjGNWZiSXTDcCghc2V6MCBp1uMmIshBi0xIUaCTFouWNhFr95o5Dmbru/26q0sYcoq57WHgfzsqNHfA5lTb3+xh/Aq8Dv3y4mN9FGpOXklaMIWAIQQiQB/wFiAAV4TFGUPwcqHkmSAq9n0MXnvnDj8Sr0Do6cydPSY+fFzdXDjtldnmFdRVFWPfOyo/nPugPoNYIb5qUhhCAzJoRnNx7ga4sycbq9pESaeWtnPRtK2zDp1YyPt9HYPUi/w82srAgmp4bh8SqUNfXg8Xpp6Bzk5a013LN8HE9vOMCl05OwmbQUNvTwwFvFAFw6PYnzJsXz9Poq5mVHEWMzMC7eSveAi88GXczJiiI/0UZyhImypl5SI81oh+4MugedI95va6+DQefJLUYXyDsAN3CPoig7hBAWYLsQ4kNFUYoCGJMkSQEUH2bEatQOGzSNtOiJCzOOOFetEhi0an9/+kGawxZfnTshjv9+dgAUuGp2qv/u4mcr88lLtPHPT8rJibMyOSWUnDgb2bFW1pW0sL2qA4AVUxP55WtFdPb7GuSkCBO3Lsjg/lf2khBmYk9tN1fNTmHVmgounZHMa9sOzQJ8cXMNFoOWovputlV1IAR8e2kOf3m/hKG1Zryzq56vLc7i7x+V8cMLxnP+5AR67W7SokIIMajpsx9q8OdmR57Ub/8QwFlAiqI0KoqyY+jvvUAxkBCoeE7Ue++9R05ODpmZmTzwwAOBDkeSxqT4MBMPXTOFtCjf7JysWAsPXjWZaOvIhi/aauBrizKHHYu1GUiJNHFwzyWdWoVZr+HigkSibXruWpxFSqSZUJOOl4YGfG+en85/N1azam0FT2+oIj8plLPH+bplHG4Phy/mrW0foKShB4tBS5/dRXashW2VHSRHmims6xoR466aTjJjLGTGWPje8vE4XF5uXZDJimmJgG8M40BrP1EWPRtKW3llWy
03//MzfvLibr61dJx/RtPc7Ei+eW4ORp16xGuciNNiDEAIkQpMATYf4Wd3AHcAJCePnPJ1OvB4PNx99918+OGHJCY
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# A moderate epsilon, between the two extremes above. But how do we find a good value?\n",
"dbscan = DBSCAN(eps=1)\n",
"display_categories(dbscan,two_blobs_outliers)"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0, 1, 0, ..., -1, -1, -1], dtype=int64)"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Cluster label assigned to each point; -1 marks an outlier\n",
"dbscan.labels_"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([False, False, False, ..., True, True, True])"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Boolean mask: True where a point was labeled an outlier\n",
"dbscan.labels_ == -1"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Number of points labeled as outliers\n",
"np.sum(dbscan.labels_ == -1)"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.29910269192422734"
]
},
"execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Percentage of all points labeled as outliers\n",
"100 * np.sum(dbscan.labels_ == -1) / len(dbscan.labels_)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Charting reasonable Epsilon values"
]
},
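{
"cell_type": "code",
"execution_count": 159,
"metadata": {},
"outputs": [],
"source": [
"# Bend the knee! One way to pick epsilon is to look for the knee (elbow) in a chart of outliers vs. epsilon. See the Kneedle paper: https://raghavan.usc.edu/papers/kneedle-simplex11.pdf"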
]
},
{
"cell_type": "code",
"execution_count": 170,
"metadata": {},
"outputs": [],
"source": [
"# np.arange(start=0.01,stop=10,step=0.01)"
]
},
{
"cell_type": "code",
"execution_count": 189,
"metadata": {},
"outputs": [],
"source": [
"outlier_percent = []\n",
"number_of_outliers = []\n",
"\n",
"# Sweep 100 epsilon values between 0.001 and 10\n",
"for eps in np.linspace(0.001,10,100):\n",
"    \n",
"    # Create and fit the model for this epsilon\n",
"    dbscan = DBSCAN(eps=eps)\n",
"    dbscan.fit(two_blobs_outliers)\n",
"    \n",
"    # Log number of outliers (points labeled -1)\n",
"    number_of_outliers.append(np.sum(dbscan.labels_ == -1))\n",
"    \n",
"    # Log percentage of points that are outliers\n",
"    perc_outliers = 100 * np.sum(dbscan.labels_ == -1) / len(dbscan.labels_)\n",
"    \n",
"    outlier_percent.append(perc_outliers)"
]
},
{
"cell_type": "code",
"execution_count": 190,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Text(0.5, 0, 'Epsilon Value')"
]
},
"execution_count": 190,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAEGCAYAAACKB4k+AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAg80lEQVR4nO3de5hcVZnv8e+vLk065EYgw0BIJEJGxQuCDYJ4HAUcQa56FEFRRJTjGQa8MCg4Ksdz9BxFR0VGGaMoURFBREFEAQPCqCPSAQS5SU4QSQgQlFzk1rd3/ti7qqs73ZWd7t670l2/z/P0U1WrLuutoP32etdeaykiMDMzAyi1OgAzM9t6OCmYmVmdk4KZmdU5KZiZWZ2TgpmZ1VVaHcB47LDDDrHrrru2Ogwzs0ll+fLlj0fEvJGem9RJYdddd6W7u7vVYZiZTSqSHhztOZePzMyszknBzMzqnBTMzKzOScHMzOo2mxQkvVnSzPT+RyVdLmnv/EMzM7OiZRkpfCwiNkp6JXAwcAFw/ubeJOkbkh6T9PuGtrmSrpN0f3q7XdouSV+StELSHU46ZmatkSUp9Ke3hwFLIuInQEeG910IHDKs7UxgWUQsBpaljwEOBRanPyeTIemYmdnEy5IUVkv6KvAW4GpJ22R5X0TcBPxlWPNRwNL0/lLg6Ib2b0XiN8AcSTtliG1MbvnjX/jXa++jt38gry7MzCalLEnhGOAa4HURsQ6YC5wxxv52jIg16f1HgB3T+/OBhxpetypt24SkkyV1S+peu3btmIK47U9PcN71K+jpc1IwM2vUdEWzpDJwa0Q8v9aW/lJfM/q7somIkLTFJ/xExBJgCUBXV9eYTgiqlJJc6JGCmdlQTUcKEdEP3Cdp4QT192itLJTePpa2rwYWNLxul7QtF9VK8rV7nBTMzIbIUj7aDrhL0jJJV9Z+xtjflcAJ6f0TgCsa2t+RXoW0H7C+ocw04TrKAqCv30eRmpk1yrIh3sfG8sGSLgZeDewgaRVwNvBp4FJJJwEPksxXAFwNvB5YATwFnDiWPrNy+cjMbGSbTQoRcaOk5wCLI+LnkqYD5QzvO26Upw4a4bUBnLK5z5wotfJRr0cKZmZDZFnR/B7gMuCradN84Ec5xpS7aikpH3mkYGY2VJY5hVOAA4ANABFxP/A3eQaVt2rZ5SMzs5FkSQrPRkRP7YGkCjCp6y4uH5mZjSxLUrhR0keATkmvBb4P/DjfsPLl8pGZ2ciyJIUzgbXAncD/ILlS6KN5BpW32kjBl6SamQ2V5eqjAeBr6c+UUPFIwcxsRKMmBUmXRsQxku5khDmEiHhJrpHlqDbR7BXNZmZDNRspvC+9PbyIQIrU4fKRmdmIRk0KtW0mIuLB4sIphstHZmYja1Y+2khSNhJDy0ciWYQ8K+fYcuN1CmZmI2s2UphZZCBFGkwKLh+ZmTXKss3Ft7O0TSbVsstHZmYjybJO4YWND9IVzS/LJ5xiDK5odlIwM2s0alKQdFY6r/ASSRvSn43AowyegzApVUsuH5mZjWTUpBAR/y+dV/hsRMxKf2ZGxPYRcVaBMU64av2QHY8UzMwaZTlk56eSXjW8MSJuyiGeQpR9SaqZ2YiyJIUzGu5PA/YFlgMH5hJRASTRUS7R4/KRmdkQWfY+OqLxsaQFwBfzCqgo1bJcPjIzGybL1UfDrQJeMNGBFK1SLrl8ZGY2zGZHCpLOY3BFcwl4KXBrjjEVolou0Tvg8pGZWaMscwrdDff7gIsj4lc5xVOYaln09nmkYGbWKEtSuATYPb2/IiKeyTGewlRdPjIz20SzxWsVSeeQzCEsBb4FPCTpHEnVogLMS7Usl4/MzIZpNtH8WWAusCgiXhYRewO7AXOAzxUQW66q5ZLLR2ZmwzRLCocD74mIjbWGiNgA/E/g9XkHlrdquUSfRwpmZkM0SwoRESMdw9nPCMdzTjaVsjynYGY2TLOkcLekdwxvlHQ8cG9+IR
WjWi7R4/KRmdkQza4+OgW4XNK7SLa1AOgCOoE35B1Y3jrKJZ7u7W91GGZmW5VmJ6+tBl4u6UAGz1S4OiKWFRJZzipl0fuMRwpmZo2y7H10PXB9AbEUKlmnMOmnRszMJtRY9j6aEqqeaDYz20RLkoKkD0i6S9LvJV0saZqkRZJulrRC0iWSOvKMwSuazcw2tdmkIGlbSaX0/t9JOnI8K5olzQdOA7oi4kVAGTgW+AzwhYjYHXgCOGmsfWRRLZfoc/nIzGyILCOFm4Bp6S/za4G3AxeOs98K0CmpAkwH1pAc2nNZ+vxS4Ohx9tFUtSx6PFIwMxsiS1JQRDwFvBH4SkS8mcGrkbZYelXT54A/kSSD9SSXvK6LiL70ZauA+SMGI50sqVtS99q1a8caRjpScFIwM2uUKSlI2h94G/CTtK081g4lbQccBSwCdga2BQ7J+v6IWBIRXRHRNW/evLGGQaXkq4/MzIbLkhTeB5wF/DAi7pL0XOCGcfR5MPBARKyNiF7gcuAAYE5aTgLYBVg9jj42q1px+cjMbLgs6xRuIplXqD1eSTJRPFZ/AvaTNB14GjiI5CCfG4A3Ad8DTgCuGEcfm9Xh8pGZ2SayHMc5D/gQyTzCtFp7RBw4lg4j4mZJl5Ec6dkH3AYsISlNfU/SJ9O2C8by+VlVSiUGAvoHgnJJeXZlZjZpZDl57SKS09cOB95L8lf82Gd4gYg4Gzh7WPNKYN/xfO6WqFaSRNDbP0C5NOYpEjOzKSXLnML2EXEB0BsRN0bEu0guH53UqqXkq3sBm5nZoCwjhd70do2kw4CHSU5km9Sq5dpIwVcgmZnVZEkKn5Q0GzgdOA+YBXwg16gKUK0kIwVPNpuZDcpy9dFV6d31wGvyDac4tfKRL0s1MxvUvrukphPN3v/IzGxQ2yaFiieazcw20bZJoVp2+cjMbLhR5xQkfbDZGyPi8xMfTnE6XD4yM9tEs4nmment84B9gCvTx0cAv80zqCK4fGRmtqlRk0JEfAJA0k3A3hGxMX38vxjcLXXSqpWPvE7BzGxQljmFHYGehsc9adukNrh4zSMFM7OaLIvXvgX8VtIP08dHk5yMNqkNjhScFMzMarIsXvuUpJ8C/y1tOjEibss3rPy5fGRmtqmsl6ROBzZExLnAKkmLcoypEC4fmZltarNJQdLZwIdJTl8DqALfyTOoItRGCn0DTgpmZjVZRgpvAI4EngSIiIcZvFx10qrURgp9Lh+ZmdVkSQo9ERFAAEjaNt+QitHhFc1mZpvIkhQulfRVYI6k9wA/B76Wb1j5q5ePnBTMzOqyXH30OUmvBTaQrG7+eERcl3tkOav4kB0zs01kWadAmgQmfSJoVL8k1RPNZmZ1zTbE+2VEvFLSRtL5hNpTQETErNyjy1E9KXii2cysrtlI4R0AETHprzQaSbkkSvI6BTOzRs0mmr8PIGlZQbEUrlouuXxkZtag2UihJOkjwN+NdLbCZD9PAdKk4PKRmVlds5HCsUA/SeKYOcLPpFctyyuazcwaNDtP4T7gM5LuiIifFhhTYSrlkucUzMwaNLv66PiI+A6wh6QXDH9+KpSPOsolelw+MjOrazanUNvOYkYRgbSCy0dmZkM1Kx99Nb39RHHhFMvlIzOzobJsnX2OpFmSqpKWSVor6fgigstbtVzyNhdmZg2ybIj3DxGxATgc+COwO3DGeDqVNEfSZZLulXSPpP0lzZV0naT709vtxtNHFtWyPFIwM2uQJSnUSkyHAd+PiPUT0O+5wM8i4vnAnsA9wJnAsohYDCxLH+eq6vKRmdkQWZLCVZLuBV4GLJM0D3hmrB1Kmg28CrgAICJ6ImIdcBSwNH3ZUuDosfaRVTJScPnIzKxms0khIs4EXgF0RUQvyQlsR42jz0XAWuCbkm6T9PX04J4dI2JN+ppHgB1HerOkkyV1S+peu3btOMLwSMHMbLgsE81vBnojol/SR0nOZ955HH1WgL2B8yNiL5IkM6RU1HjS23
ARsSQiuiKia968eeMII0kKfR4pmJnVZSkffSwiNkp6JXAwSdnn/HH0uQpYFRE3p48vI0kSj0raCSC9fWwcfWRSKXm
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.lineplot(x=np.linspace(0.001,10,100),y=outlier_percent)\n",
"plt.ylabel(\"Percentage of Points Classified as Outliers\")\n",
"plt.xlabel(\"Epsilon Value\")"
]
},
{
"cell_type": "code",
"execution_count": 192,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(0.0, 1.0)"
]
},
"execution_count": 192,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAZMAAAEGCAYAAACgt3iRAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAp00lEQVR4nO3deZxddX3/8dd7lmSyZ24WCNlmIvtOmJmAWJVNWSxYBQVF0WJpKyoqbcVSS6u/X6ulglKpLSoK/oSqVCQClVpAUaskk7CERUzIRtgCyWQjZJmZz++Pcya5GTJ3Tmbm3jt35v18PM7jnvM92yfHkI/f7/ec71cRgZmZWX9UlTsAMzOrfE4mZmbWb04mZmbWb04mZmbWb04mZmbWbzXlDqAYJk+eHA0NDeUOw8ysoixatOiViJjSl3OHZDJpaGigtbW13GGYmVUUSav6eq6buczMrN+cTMzMrN+cTMzMrN+cTMzMrN+cTMzMrN+cTMzMrN+Klkwk3SRpraTH88pykn4maWn6W5+WS9L1kpZJekzS3LxzLk6PXyrp4mLFa2ZmfVfMmsl3gDO6lV0J3BcRBwH3pdsAZwIHpculwNchST7A1cA8oAW4uisBFbJ283Yef27jAPwRzMwsi6Ilk4h4EFjfrfhc4OZ0/WbgnXnlt0Tit8BESdOAtwM/i4j1EdEG/IzXJ6jXeWnTNu57au0A/CnMzCyLUveZ7BcRL6TrLwL7pevTgWfzjluTlvVU/jqSLpXUKqm1RsGClesGNnIzM+tR2TrgI5niccCmeYyIGyOiKSKaJoweyeJVG9jZ0TlQlzczswJKnUxeSpuvSH+72qKeA2bmHTcjLeupvKAxI2t4bWeH+03MzEqk12Qi6SRJY9L1iyRdK2l2H+83H+h6I+ti4M688g+mb3WdAGxMm8PuBd4mqT7teH9bWlbQmJHJ+JULVnTvsjEzs2LIUjP5OrBV0jHAFcAzwC29nSTpNuA3wCGS1ki6BPgicLqkpcBp6TbAPcByYBnwDeCjABGxHvgCsDBdPp+WFVRTJeZMHsPClU4mZmalkGUI+vaICEnnAl+LiG+liaGgiLiwh12n7uXYAC7r4To3ATdliHMPzQ05fvrEi3R2BlVV2tfTzcxsH2SpmWyW9FngIuBuSVVAbXHD6r+WxhwbX9vJ79duLncoZmZDXpZk8l5gO3BJRLxI0gl+TVGjGgAtjTnA/SZmZqVQMJlIqgZui4hrI+KXABGxOiJ67TMptxn1o5g2oY6HnEzMzIquYDKJiA6gU9KEEsUzYCTR0phj4Yr1JF0yZmZWLFk64LcASyT9DHi1qzAiPlG0qAZIc0OOOx95nlXrttIweUy5wzEzG7KyJJMfpUvFmdfVb7JyvZOJmVkR9ZpMIuJmSaOAWRHxdAliGjAHTh1L/ehaFqxYz3uaZvZ+gpmZ9UmWL+D/EHgE+Gm6fayk+UWOa0BIorkh5ze6zMyKLMurwX9HMpfIBoCIeASYU7SIBlhLY47V67fy4sZt5Q7FzGzIypJMdkZE9xETK2Y43pa8fhMzMyuOLMnkCUnvA6olHSTpX4D/LXJcA+bwaeMZM6KahW7qMjMrmizJ5OPAESRfwd8GbAI+WcSYBlRNdRVzZ9e738TMrIh6TSYRsTUiroqI5nTyqasioqI6IOY15nj6pc20vbqj3KGYmQ1JPb4aLOkrEfFJST9hLzMiRsQ5RY1sALU0TgKgdVUbpx++Xy9Hm5nZvir0ncl3099/LkUgxXT0jAmMqK5iwYp1TiZmZkXQYzKJiEXp7y9KF05x1NVWc+zMiSxY2VbuUMzMhqRCzVxL2EvzVpeIOLooERVJc2M9//aL5by6vX3XtL5mZjYwCv2r+o6SRVECLY2TuOGBZ1i8uo0/OGhKucMxMxtSenybKyJWRcQq4KNd6/llpQtxYBw/u54q4e9NzMyKIMt3JqfvpezMgQ6k2MaOrOGIAyZ4siwzsyLoMZlI+v
O03+RQSY/lLSuAx0oX4sBpaczxyLMb2N7eUe5QzMyGlEI1k1uBPwTuTH+7luMj4qISxDbgWhpzbG/vZMma7kONmZlZfxTqM9kYESuBz5C81dW1jJU0qzThDazmhmTQRzd1mZkNrCzvyN5NkkQE1AGNwNMk43VVlNyYERw0dSwLPYKwmdmAyjLT4lH525LmUoFvc3Vpbszxk0eep6MzqK5SucMxMxsSsrzNtYeIWAzMK0IsJTGvMcfm7e089cKmcodiZjZk9FozkfTpvM0qYC7wfNEiKrKufpMFK9Zz5PQJZY7GzGxoyFIzGZe3jCTpQzm3mEEV0wETRzGjfpTnNzEzG0BZ+kz+XtLYdH1L8UMqvpbGHL94+mUiAsn9JmZm/VWwZiLpo5JWA6uAVZJWSarYzvcuLQ051r26g2defrXcoZiZDQmFvoD/G5LBHt8aEZMiYhJwMnBmuq/PJH1K0hOSHpd0m6Q6SY2SHpK0TNL3JY1Ijx2Zbi9L9zf0596Q1EwAvyJsZjZACtVMPgC8KyKWdxWk6+8BPtjXG0qaDnwCaIqII4Fq4ALgS8B1EXEg0AZckp5yCdCWll+XHtcvjZPHMHnsSPebmJkNkELJJPY213tEvAZ09vO+NcAoSTXAaOAF4BTg9nT/zcA70/Vz023S/aeqnx0dkmhprHcyMTMbIIWSyXOSTu1eKOkUkn/8+yQiniOZCnh1ep2NwCJgQ0S0p4etAaan69OBZ9Nz29PjJ+0lrksltUpqffnll3uNo6Uhx3MbXmNN29a+/lHMzCxV6G2uTwB3SvoVyT/2AE3ASfTj1WBJ9en5jcAG4IfAGX29XpeIuBG4EaCpqanHGSK7NOf1m8yoH93f25uZDWuFBnp8AjgSeBBoSJcHgSPTfX11GrAiIl6OiJ3Aj0gS1MS02QtgBvBcuv4cMBMg3T8BWNeP+wNw6P7jGVdXw4IVnhfezKy/Cn5nkvaZ3DTA91wNnCBpNPAacCrQCjwAnAf8B3AxydD3APPT7d+k+++PiF5rHr2prhLNDTkWrOh3XjIzG/b2eWyu/oqIh0g60hcDS9IYbiQZ6v7TkpaR9Il8Kz3lW8CktPzTwJUDFUtzQ45nXn6VV7ZsH6hLmpkNS1mGoB9wEXE1cHW34uVAy16O3QacX4w4ur43aV25njOOnFaMW5iZDQv7VDORVC/p6GIFU2pHTZ9AXW2VJ8syM+unXpOJpJ9LGi8pR9I09Q1J1xY/tOIbUVPFcTPr/SW8mVk/ZamZTIiITcC7gFsiYh7JG1lDQktjjief38SmbTvLHYqZWcXKkkxqJE0jGUblriLHU3ItjTk6Axat8ivCZmZ9lSWZfB64F1gWEQslzQGWFjes0jlu1kRqqsRC95uYmfVZlvlMfkjylXrX9nLg3cUMqpRGj6jhyOkTPE6XmVk/ZJm2t45k5N4jgLqu8oj44yLGVVLzGnPc9OsVbNvZQV1tdbnDMTOrOFmaub4L7A+8HfgFyVAnm4sZVKm1NObY2RE88uyGcodiZlaRsiSTAyPic8CrEXEzcDYwr7hhlVbT7BwSbuoyM+ujLMmk653ZDZKOJBlocWrxQiq9CaNrOWS/cf7exMysj7IkkxvTYeP/hmTQxScZgNkOB5uWxhyLVrWxs6O/836ZmQ0/vSaTiPhmRLRFxIMRMScipkbEv5ciuFJqacyxdUcHTzy/qdyhmJlVnJKPGjxYtTSkk2W538TMbJ85maSmjq+jYdJoD/poZtYHTiZ5WhpztK5aT2dnv+feMjMbVnr8aFHSuwqdGBE/Gvhwyqu5IccPWtewdO0WDtl/XLnDMTOrGIW+gP/D9Hcq8Ebg/nT7ZOB/SeZuH1LmNU4CYMGKdU4mZmb7oMdmroj4cER8GKgFDo+Id0fEu0mGVaktVYClNDM3iv3H17FgpUcQNjPbF1n6TGZGxAt52y8Bs4oUT1lJorkxx4IV64hwv4mZWVZZksl9ku6V9CFJHwLuBv6nuGGVT0tjjp
c2befZ9a+VOxQzs4qRZQj6j0n6I+DNadGNEXFHccMqn3mNyfcmD61Yx6xJo8scjZlZZcj6avBi4O6I+BRwr6Qh2zt
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.lineplot(x=np.linspace(0.001,10,100),y=number_of_outliers)\n",
"plt.ylabel(\"Number of Points Classified as Outliers\")\n",
"plt.xlabel(\"Epsilon Value\")\n",
"plt.xlim(0,1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Should we target a percentage of outliers instead?\n",
"\n",
"If so, you could \"target\" a percentage: for example, choose an epsilon from the range that labels 1%-5% of the points as outliers."
]
},
{
"cell_type": "code",
"execution_count": 193,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.collections.LineCollection at 0x19a401a0af0>"
]
},
"execution_count": 193,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYMAAAEKCAYAAADw2zkCAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAmgUlEQVR4nO3deZgdZZn+8e/dS7qzdANJujGQBEJANg2BdIDgMog4yKq4EFBUQInbqPxEnFHRGfcZ3BhRGeLgALIL4oKgsiOGJQn7vglCjGQBs0In3f38/qjq5CR0n64kXWfr+3NddXWdqrPcKQ79dL1V7/sqIjAzs6GtrtwBzMys/FwMzMzMxcDMzFwMzMwMFwMzM8PFwMzMgIY831zSM8AKoBvoioiOPD/PzMw2T67FIPWWiFhSgs8xM7PN5GYiMzNDefZAlvQX4CUggHMiYnYfz5kFzAIYOXLktN122y23PJsqAh782zK2bW2mvaWp3HHMzF5l/vz5SyKibUvfJ+9isH1ELJDUDlwHfCoibu3v+R0dHTFv3rzc8myOqV/7I0dO2Y6vv/N15Y5iZvYqkuYPxvXYXJuJImJB+nMRcBWwb56fl4f2liYWrXil3DHMzHKVWzGQNFJSS+868M/Ag3l9Xl7aW5pZtKKz3DHMzHKV591E2wJXSer9nIsj4vc5fl4u2luauPMvq8odw8wsV7kVg4h4Gtgrr/cvlbbWJhav6CQiSAubmVnNGbCZSNJ7C5p7Tpf0S0n75B+tMrS3NLOmu4dlL68tdxQzs9xkuWbw5YhYIemNwMHAucDZ+caqHL23lPq6gZnVsizFoDv9eTgwOyJ+BwzLL1JlWVcMlrsYmFntylIMFkg6B5gJXCOpKePrakJ7azOAby81s5qW5Zf6McAfgEMi4h/AaOC0PENVEjcTmdlQUPRuIkn1wN0RsW6MiIhYCCzMO1ilGNnUwMhh9W4mMrOaVvTMICK6gcckTSxRnorU3trsZiIzq2lZ+hlsAzwk6S5gXe+riDgqt1QVpq2lyc1EZlbTshSDL+eeosK1tzTx0N+WlzuGmVluBryAHBG3AM8Ajen6XODunHNVlPaWZhYtdzORmdWuLD2QTwauAM5JN20P/CrHTBWnvbWJVWu6WdXZVe4oZma5yHJr6SeBNwDLASLiCaA9z1CVxreXmlmty1IMOiNiTe8DSQ0kM5cNGe0tacczNxWZWY3KUgxukfRFYLiktwG/AH6bb6zK0t7qMwMzq21ZisG/AYuBB4CPAtcAp+cZqtK4mcjMat2At5ZGRA/w03QZkrYa3siwhjp3PDOzmtVvMZB0eUQcI+kB+rhGEBFTck1WQSTRNiqZ5MbMrBYVOzP4TPrziFIEqXRtLS4GZla7+i0G6YB0RMSzpYtTudpbmnh26epyxzAzy0WxZqIVJM1DYsNmIgEREa05Z6so7a1NzH3mxXLHMDPLRbEzg5ZSBql07S3NvLR6LWu6ehjWMGTm9jGzISLLcBQ/z7Kt1vXeXrp4pa8bmFntyfIn7p6FD9IeyNPyiVO51nU8cy9kM6tB/RYDSV9IrxtMkbQ8XVYALwC/LlnCCrFuSArfUWRmNajfYhAR306vG3wnIlrTpSUixkTEF0qYsSK4F7KZ1bIsk9tcK+nNG2+MiFtzyFOxxoxqok6w2M1EZlaDshSD0wrWm4F9gfnAQbkkqlD1dWLMKE9/aWa1KcvYREcWPpY0ATgzr0CVrN1zIZtZjdqcG+afB3Yf7CDVICkGbiYys9oz4JmBpLNY3wO5DpjKEJsDuVd7SzMP/W15uWOYmQ26LNcM5hWsdwGXRMSfc8pT0dpbm1iyspPunqC+TuWOY2Y2aLIUg8uAndP1JyNiyLaTtLc00ROwdFXnun4HZma1oFinswZJZ5BcIzgfuAB4TtIZkhpLFbCStK2bC9kXkc2sthS7gPwdYDQwKSKmRcQ+wGRga+C7JchWcXqHpPC8BmZWa4oVgyOAky
NiRe+GiFgOfBw4LO9glWh9L+Qh21JmZjWqWDGIiOhrustu+pgGsz+S6iXdI+nqzQlYSdp6i4GbicysxhQrBg9L+uDGGyUdDzy6CZ/xGeCRTQ1WiZoa6tl6RKM7nplZzSl2N9EngV9KOolk+AmADmA4cHSWN5c0Hjgc+Cbw2S3IWTHc8czMalGxmc4WAPtJOoj1cxpcExE3bML7nwl8Huh31jRJs4BZABMnTtyEty6P9pZmnxmYWc3JMjbRjcCNm/rGko4AFkXEfEkHFnn/2cBsgI6OjszXIsqlraWJZ55ZVe4YZmaDKs/JfN8AHCXpGeBS4CBJF+b4eSXRO1hdH9fWzcyqVm7FICK+EBHjI2JH4Fjgxog4Pq/PK5W2libWdPWw/OWuckcxMxs0AxYDSSMl1aXrr5V01FDtgQzQ3to7/aUvIptZ7chyZnAr0Cxpe+CPwAeA8zblQyLi5og4YtPjVR5Pf2lmtShLMVBErAbeBfwkIt7L+ruLhhz3QjazWpSpGEiaAbwf+F26rT6/SJVtXTOReyGbWQ3JUgw+A3wBuCoiHpK0E3BTvrEq16imBkYMq3czkZnVlCz9DG4luW7Q+/hp4NN5hqp0ngvZzGpNlmkv20h6Ee8JrJvRJSIOyjFXRWtvaWbRcl8zMLPakaWZ6CKSgekmAV8FngHm5pip4rW1NnlOAzOrKVmKwZiIOBdYGxG3RMRJwJA9KwA3E5lZ7ckyB/La9OdCSYcDfyOZAW3Iam9pZmVnF6vXdDFiWJZDaGZW2bL8JvuGpK2AU4GzgFbg/+WaqsK1F0xys+NYFwMzq35Z7ibqnaFsGfCWfONUh965kBet6GTHsSPLnMbMbMvlOWppzWpv8fhEZlZbXAw2Q7vnQjazGuNisBm2HtHIsPo631FkZjWj32sGkorOWRwR3x/8ONVBEm2eC9nMakixC8i98xbvCkwHfpM+PhK4K89Q1aCtxR3PzKx29FsMIuKrAJJuBfaJiBXp4/9g/eilQ1Z7SxPPLl1d7hhmZoMiyzWDbYE1BY/XpNuGtPZWNxOZWe3I0mPqAuAuSVelj98JnJ9boirRNqqZl1avZU1XD8MafB3ezKpblk5n35R0LfCmdNOJEXFPvrEqX2/HsyUrO9lu6+FlTmNmtmWy/kk7AlgeEf8NPC9pUo6ZqoLnQjazWjJgMZD078C/ksx2BtAIXJhnqGqwrhey5zUwsxqQ5czgaOAoYBVARPyN9bedDlmF4xOZmVW7LMVgTUQEEACSPDIbMGbkMCQXAzOrDVmKweWSzgG2lnQycD3w03xjVb6G+jrGjGxisW8vNbMakOVuou9KehuwnKQ38lci4rrck1WB9pYmD1ZnZjUh08ws6S9/F4CNJB3PXAzMrPr120wk6bb05wpJywuWFZKWly5i5Wr3YHVmViOKnRl8ECAihvydQ/1pb2lmyco1dPcE9XUqdxwzs81W7ALyLwAk3VCiLFWnvbWJ7p7gxVVrBn6ymVkFK3ZmUCfpi8Br+5rbYCjPZ9BrfS/kV2hL183MqlGxM4NjgW6SgtHSxzLkta2bC9kXkc2suhWbz+Ax4L8k3R8R15YwU9XoPTNY7NtLzazKFZv28viIuBDYQ9LuG+93MxHrmoZ8R5GZVbti1wx6h50YVYog1ai5sZ6thje6mcjMql6xZqJz0p9f3Zw3ltQM3Ao0pZ9zRUT8++a8VyVzL2QzqwVZhrA+Q1KrpEZJN0haLOn4DO/dCRwUEXsBU4G3S9p/C/NWHE9/aWa1IMtAdf8cEcuBI4BngJ2B0wZ6USRWpg8b0yU2M2fFam9pdjORmVW9LMWgtynpcOAXEbEs65tLqpd0L7AIuC4i7uzjObMkzZM0b/HixVnfumIkQ1J0kozybWZWnbIUg6slPQpMA26Q1AZkaheJiO6ImAqMB/aV9Lo+njM7IjoioqOtrW0ToleGtpYm1nT1sPzlrnJHMTPbbAMWg4j4N+AAoCMi1pLMePaOTfmQiPgHcB
Pw9s3IWNF8e6mZ1YIsF5DfC6yNiG5Jp5PMf7xdhte1Sdo6XR8OvA14dMviVp5290I2sxqQpZnoyxGxQtIbgYOBc4G
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.lineplot(x=np.linspace(0.001,10,100),y=outlier_percent)\n",
"plt.ylabel(\"Percentage of Points Classified as Outliers\")\n",
"plt.xlabel(\"Epsilon Value\")\n",
"plt.ylim(0,5)\n",
"plt.xlim(0,2)\n",
"# Dashed red reference line at 1% outliers\n",
"plt.hlines(y=1,xmin=0,xmax=2,colors='red',ls='--')"
]
},
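{
"cell_type": "markdown",
"metadata": {},
"source": [
"The percentage target can also be read off the sweep numerically. The cell below is a minimal sketch, not part of the original analysis: `eps_in_percent_range` is a hypothetical helper name, and it assumes that `outlier_percent` was filled by the sweep loop above, one entry per value of `np.linspace(0.001,10,100)`. The 1%-5% bounds are only the example range mentioned earlier, and the result may be empty if no swept epsilon falls inside it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def eps_in_percent_range(eps_values, percents, low=1, high=5):\n",
"    # Hypothetical helper: keep the epsilon values whose outlier\n",
"    # percentage falls inside [low, high]\n",
"    eps_values = np.asarray(eps_values)\n",
"    percents = np.asarray(percents)\n",
"    mask = (percents >= low) & (percents <= high)\n",
"    return eps_values[mask]\n",
"\n",
"# Assumes outlier_percent comes from the sweep loop above\n",
"eps_in_percent_range(np.linspace(0.001,10,100), outlier_percent)"
]
},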
{
"cell_type": "code",
"execution_count": 194,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAABV6klEQVR4nO3dd5hU1d3A8e+ZXnZmtvdeYXfpS0cpooANsfeuiTGmJybxTfKmmFcToylqEgv22GLvFZCi9L7LNpbtvZfpc98/ZhlYFxGEZVjmfJ6H52Hu3rnzm4E9v7mn/I5QFAVJkiQp9KiCHYAkSZIUHDIBSJIkhSiZACRJkkKUTACSJEkhSiYASZKkEKUJdgBHIzo6WklPTw92GJIkSaPK5s2b2xRFifny8VGVANLT09m0aVOww5AkSRpVhBDVhzouu4AkSZJClEwAkiRJIUomAEmSpBA1qsYADsXtdlNXV4fD4Qh2KEfMYDCQnJyMVqsNdiiSJIWwUZ8A6urqsFgspKenI4QIdjhfS1EU2tvbqaurIyMjI9jhSJIUwkZ9AnA4HKOm8QcQQhAVFUVra2uwQ5Ek6TA6+pzsaeihvc9JUoSJMYlWTPpR32QOcUq8m9HS+O832uKVpFNZR58Th9tLtEWPTqMGoMfu5u8flPL+jsbAeT9cMoZLpqWiUp06v7+nRAKQJEk6Gq09Dmra+3G4vby/vZFPdjdx5rgEbpqXRWqUmaqWviGNP8DDH5UxMyea1ChzkKI+/uQsoONoz549zJw5E71ez3333RfscCRJOoSKpl5uXb6e25/cxI+f24rb6+OMgng+2NHIYysqcXq89Dk9w57n9PiwO71BiHjkyARwHEVGRvL3v/+dn/zkJ8EORZKkQ3B5vDy5upLGzgOzBleWtJARG4YQsLmqnY5eJ+nRZkx69ZDnjkmwkhBuONEhj6iQSwD9r75G07QZ1Cen0jRtBv2vvnbcrh0bG8vUqVPl9E5JOkn12j1sruocdryr38Wl01NZND6BO1/YxrNrq7jvismMTw1HrRLMyYvhV8sKsZp0QYh65ITUGED/q6/R/bM7Uex2ALz19XT/7E4AzBcuC2ZokiSdAFajlmlZUXzwpf79zNgw1le2s6K4GYCypl5WlrTwrxun4fZ4sbu8hBlOvS92IZUAeu+5N9D476fY7fTec69MAJJ0CmvtcVBc301Tt52FBXGoBby7vREh4OKpqSRFGFlV0jzkORq1oLShmz+/U0Kvw0OkWcfvLh5PUWbUsOvXtfezpbqTiuZepmZEMT41HNsouFsIqQTgbWg4quNH4qGHHuLRRx8F4N133yUxMfEbX0uSpKOjKAp7GnsorutGp1FRkBxOZmzYkHO6+l3c81Yxa8sOrL35zsIcnvr2dDr63FS19NHc7eCq2en8Z101Xp8CwLmTkrj7jd04PT4AOvpd3PXydh64ajKZsWEYdP7mc3ddF2tKW3F5fUSF6bnnrd1cMj2Va+dknvRTRkMqAagTE/HW1x/y+Dd1++23c/vttx9LWJIkfY2yxh4+2tVEdVsfi8YnUpQRic2kY2dtF7c/uRG3199oW41aHr5+KtnxlsBz97b0DWn8AR5fWUlegpUfPrslcCwlysQl01P574Yazp6QSEa0mYLkcIoyI3F7fGjUKt7b3sDnFW2sq2jj3AlJdNld3PHUJgZc/tlBOo2K287I4V+flLOwMJ7kyJN7ymhIJQDLz+8cMgYAIIxGLD+/87hcv6mpiaKiInp6elCpVPz1r3+luLgYq9V6XK4vSaFoX2sf331qEz12NwCf7WnlR0vGcOHUFJ5bty/Q+IN/Ade68tYhCWDAdegpnWWNPUOO1bYPMD7FRkbMWNaVtVHR0scFU5Jo6nLgEQIB3LYwh9q2fv71aQWpUSbKGnsDjT+Ay+Nje00nadFmvL4D127oHKC+047FoCE92hy4ewi2kyOKE2R/P3/vPffibWhAnZiI5ed3Hrf+//
j4eOrq6o7LtSRJ8itr6g00/vs9vqqS08bE0tw9vAhka49zyOO0aDNhBg19jgOJYFxKOHE245DzEsKN7G3p57GVlf4DeyDaoueciYk8tboKgPRoMzfOzQJgQ2U7DrePL+uxu1lYEB+YMrqztpMfP7c18B6unp3O6WNi8foU0mPMRJj1R/NxHFchNw3UfOEy4jd8QVJdDfEbvpCDv5I0CikKqFWCi6elDvvZaWOG7nyYEmXm1xcUMiE1HItBw/yxcczMiWbD3jbSow900SwoiOO5dfuGPLet14lWfaCZ3NfWT9eAC5UAl9tHYbJt2OsvLIznzPHx6DRqegZc3PtWyZAE9uzafWyu6uC2Jzby0/9spba9/5t+DMcspO4AJEk6+fl6e3EXF+NtbESdlER2cs6wb/A3zM0k1mpgdm40d51fwK66LgZcHublxzMuJZzuARftvU7CjFpirQZWlDRj1Gk4a3wCu2q7WVHSTF6Chatmp/HMmn2Y9RompEbw3/U1Xxtf14ALm1FLXqKVlSUtfGtBNu/vaMTrU7h2Tgbz8+OwGP1TRnvsHiqae4ddwzU4sLyrrpvVpS1cOWt4ZeCOPufg3Y+L1Cgz2XEWNOrj+51dJgBJkk4aPoeDvkcepff+BwLHYv7wOx68dhnv7WiiqrWPcyclMW1wKqbbq9Dn9LC9tou0KDMxFj11HQP87rVdlDf1Ehmm466lBeQn2fjLu3uGvNacvFgEggUFcWTFWtha1c45k5J4dWNt4ByrUYtXUYY8Lz06jF8uLeQfH5ZS0z5AXccA18xOJypMz4zsaMKMB9YLhJu1FCRb2V03dLxBpznQkG/c2z4sAXT0Obnnrd18tsc/eK0ScM9lEzl9bNw3+Vi/kkwAkiSdNDwVFfQ+8Nchx3r+93dkzZzFD5eMGXLc61N46Ysanl3r75/f19rP+sp2bl2QTXmT/1t3R5+LR1dU8KMlY7hgSjJvba3HpyickR+PAP7wxu7A9f7vsgn02D3cPC+LDZXtZMaGcUZBPKv3tKBRC8JNOi6dnspz6/axaHwCM7KjuXCqERC8uL4aFWB3eciKsxBm0JIQbiTMoOXW+Tn88c3dNHc7At1WZY09xFj1tPY4mZMbO+xzKG/qDTT+AD4F/vxOCfnJNqItx68cRdASgBAiBXgaiAMU4BFFUf4WrHgkSQo+X3e3v4P/YB4Pvp7uYee29Dh4aX31kGMOt3dIV1GMVc+c3FieXr0PvUZw7ZwMhBBkx4Xx3Lp9fGtBNi6Pj7RoM29vrWdtWRsmvZqxiTYau+30Oz3MyIliYnoEXp9CeVMPXp+Phk47r2ys4cdLxvDM2n1cNDUFm0nL7oYe7nm7BICLpqZw9oREnllTxZzcGOJsBsYkWukecPO53c2snBgKk22kRpkob+olPdqMdvDOoNvuGvZ+W3ud2F3HtxhdMO8APMCPFUXZIoSwAJuFEB8pilIcxJgkSQoiTUoKqohwfJ1dgWOquDg0KcMHe9UqgUGrDvSnB65x0OKrs8Yl8J/P94ECl89MZ/kq/wyfXy8rpCDZxr8/rSAvwcrEtHDyEmzkxltZXdrC5qoOAJZOTub3rxfT2e9vkFOiTNw0L4vfvrqTpAgTO2q7uXxmGstXVnLRtFRe33RgFuBL62uwGLQU13ezqaoDIeD7i/L4+welDK41491t9XzrjBz++XE5Pzt3LOdMTKLX4SEjJowwg5o+x4EGf3Zu9HH99g9BnAWkKEqjoihbBv/eC5QAScGK51i9//775OXlkZ2dzT333BPscCRpVNKkphL5xBNocnL8jwsKiHz8UdQJ8cPOjbUa+NaC7CHH4m0G0qJN7N9zSadWYdZruKAomVibntvOyCEt2ky4ScfLgwO+N8zN5D/rqlm+qpJn1lZRmBLO6WP83TJOj5eDF/PWtg9Q2tCDxaClz+EmN97Cpr0dpEab2V3XNSzGbTWdZMdZyI6z8KMlY3G6fdw0L5ulU5IB/xjGvtZ+Yix61p
a18uqmWm749+f88qXtfG/RmMCMptm50dxxVh5GnXrYaxyLk2IMQAiRDkwC1h/iZ7cCtwKkpg7/FnAy8Hq93H777Xz
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# Refit with a smaller epsilon (eps=0.4) and inspect the resulting clusters and outliers\n",
"dbscan = DBSCAN(eps=0.4)\n",
"display_categories(dbscan,two_blobs_outliers)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Should we target a number of outliers instead?\n",
"\n",
"If so, you can \"target\" a number of outliers, for example 3 points, and pick the epsilon that produces that count."
]
},
{
"cell_type": "code",
"execution_count": 203,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.collections.LineCollection at 0x19a40070670>"
]
},
"execution_count": 203,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEKCAYAAAAfGVI8AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAh6ElEQVR4nO3deZxddX3/8dd7lmRmkswkMwkUSGLCgFhEZBmWJJa1KChCiwhaQyvSxqVVqLZWWpCfP/qr/aFFtCo1ZS+U/hCwLIJCWYUgkCAQAVkSlgTCEpYkkD35/P44Z8JkmZuTmXvvuXPP+/l4nMc995w753wuy3zme77f7+eriMDMzIqrIe8AzMwsX04EZmYF50RgZlZwTgRmZgXnRGBmVnBOBGZmBVexRCDpIkmvSvptn2Odkm6V9HT6OqZS9zczs2wq2SK4BDhyk2PfAG6LiF2B29L3ZmaWI1VyQpmkScCNEbFH+v5J4JCIWCRpB+DOiNitYgGYmdlWNVX5fttHxKJ0/2Vg+/4+KGkGMANgxIgR+77vfe8rayBLV6zh+TeWs8t2I2ltbizrtc3MasGcOXMWR8S4rX2u2olgg4gISf02RyJiJjAToKenJ2bPnl3W+8+at5g/+ff7uXDGgRy4c1dZr21mVgskPZ/lc9UeNfRK+kiI9PXVKt9/g/aWZgCWrFiTVwhmZjWh2ongeuDP0v0/A66r8v036Gh1IjAzg8oOH70SuA/YTdJCSacA/wwcIelp4A/T97loTxPBUicCMyu4ivURRMSn+zl1eKXuuS1GDW9CciIwMyvszOKGBjFqeBNLV67NOxQzs1wVNhEAdLQ1u4/AzAqv2Img1YnAzKzQiaC9pdl9BGZWeIVOBG4RmJkVPBG0tzSzdKUTgZkVW6ETgTuLzcyKngham1m5Zj2r1q7LOxQzs9xsNRFImiZpRLo/XdK5kt5T+dAqr70lmU+3dIXnEphZcWVpEZwPLJf0QeBrwDzgsopGVSXtrjdkZpYpEayNZPWaY4EfRsSPgFGVDas6NtQbcoexmRVYllpDyySdDkwHDpLUADRXNqzqcAVSM7NsLYITgVXAKRHxMjAe+E5Fo6qS3jUJPKnMzIqsZItAUiNwZUQc2nssIl6gTvoIOlyK2sysdIsgItYB6yV1VCmeqmpvTfKgHw2ZWZFl6SN4G5gr6Vbgnd6DEfGVikVVJcObGmlpbnApajMrtCyJ4Np0q0sdrc0sWe4WgZkV11YTQURcKqkVmBgRT1YhpqpyvSEzK7osM4s/DjwM/CJ9v5ek6yscV9W4AqmZFV2W4aP/C9gfeAsgIh4Gdq5YRFXmRGBmRZclEayJiCWbHFtfiWDy0N7qR0NmVmxZOosfk/QnQKOkXYGvALMqG1b1uLPYzIouS4vgy8D7SWYXXwksBU6rYExV1d7SxLJVa1m/PvIOxcwsF1lGDS0H/iHd6k57azMRsGzV2g0zjc3MiqTfRCDpvIg4TdINwGZ/LkfEMRWNrEra+5SZcCIwsyIq1SL4j/T1u9UIJC99K5BOyDkWM7M89JsIImJO+npX9cKpPheeM7OiK/VoaC5beCTUKyL2rEhEVbahFLWHkJpZQZV6NHR01aLIUUebF6cxs2Lrd/hoRDwfEc8DX+rd73useiFWVu8C9k4EZlZUWeYRHLGFY0eVO5C8jBzeRGODWLrCpajNrJhK9RF8keQv/25Jj/Y5NQq4t9KBVYsk2lua3CIws8Iq1Ufwn8DNwLeBb/Q5viwi3qhoVFXmekNmVmSlho8uAZZI+rtNTo2UNDJdu7guuAKpmRVZlqJzPycZRiqgBZgMPElSf6gutLc4EZhZcWWpNfSBvu8l7cMgRw1J+mvgz0kSzFzg5IhYOZhrDkZHazOLlqzI6/ZmZrnKMmpoIxHxEHDAQG8oaSeSUtY9EbEH0Ah8aqDXK4f21iaWeNSQmRXUVlsEkr7a520DsA/wUhnu2yppDdBWhusNijuLza
zIsrQIRvXZhpP0GRw70BtGxIskhexeABYBSyLilk0/J2mGpNmSZr/22msDvV0mHa3NrF67npVr1lX0PmZmtShLH8G3JI1M998e7A0ljSFJJJNJ1kH+qaTpEXH5JvedCcwE6OnpqeiqMb31hpasWENLc2Mlb2VmVnNKtggkfUnSC8DzwPOSnpc02PISfwg8GxGvRcQa4Fpg6iCvOSiuQGpmRdZvIpB0BknhuUMioisiuoBDgaPScwP1AnCgpDZJAg4HnhjE9QatvdWF58ysuEq1CE4CjouI+b0H0v0TgD8d6A0j4n7gauAhkqGjDaSPgPKyoUXgDmMzK6BSfQSxpbH9EbFC0vrB3DQizgLOGsw1yqnDLQIzK7BSLYIXJR2+6UFJh5GM9qkbG0pRL3ciMLPiKdUi+ApwnaR7gDnpsR5gGoMYPlqLNixgv9KTysyseEotTPMYsAdwNzAp3e4G9kjP1Y3mxgbahjX60ZCZFVLJeQRpH8FFVYolVx2tzR4+amaFtM21huqVK5CaWVE5EaS8JoGZFdU2JQJJYyTtWalg8pQUnnNnsZkVz1YTgaQ7JbVL6iSZBPbvks6tfGjV1d7a5D4CMyukLC2CjohYChwHXBYRB5DUC6or7iw2s6LKkgiaJO1AUlrixgrHk5v2lmaWrVrLuvUVLXRqZlZzsiSC/w38EngmIh6UtDPwdGXDqj5XIDWzosqyHsFPgZ/2eT8f+EQlg8pD38JzY0YMyzkaM7PqybJUZQtwCvB+oKX3eER8roJxVZ1LUZtZUWV5NPQfwO8BHwHuAsYDyyoZVB7efTTkIaRmVixZEsEuEXEm8E5EXAp8DDigsmFVX3trWoHULQIzK5gsiaD3N+NbkvYAOoDtKhdSPrwmgZkV1Vb7CICZ6YLzZwDXAyOBMysaVQ56F7D3KmVmVjRZRg1dkO7eDexc2XDy0zaskaYGuUVgZoXjonMpSZ5dbGaF5ETQR7srkJpZATkR9OFEYGZF1G8fgaTjSv1gRFxb/nDy1d7S5FLUZlY4pTqLP56+bgdMBW5P3x8KzALqLhF0tDaz8M0VeYdhZlZV/SaCiDgZQNItwO4RsSh9vwNwSVWiqzJ3FptZEWXpI5jQmwRSrwATKxRPrnr7CCJcitrMiiPLhLLbJP0SuDJ9fyLwP5ULKT8drc2sXR8sX72OEcOz/KMxMxv6skwo+ytJfwwclB6aGRE/q2xY+eg7u9iJwMyKIutvu4eAZRHxP5LaJI2KiLqtQLpkxRp26GjNORozs+rIsnj9XwBXAz9JD+0E/HcFY8pNbwVSl6I2syLJ0ln8l8A0YClARDxNHVYfBVcgNbNiypIIVkXE6t43kpqAuhxW40RgZkWUJRHcJenvgVZJR5CsX3xDZcPKx4bOYicCMyuQLIngG8BrwFzg88BNJGsT1J1RLV6lzMyKJ8vw0fXAv6dbXWtqbGDk8CYvTmNmhVKq6NxVEXGCpLlsoU8gIvYc6E0ljQYuAPZIr/25iLhvoNcrpw5XIDWzginVIjgtfT26Avf9PvCLiDhe0jCgrQL3GJB21xsys4IplQhuBPYB/jEiTirXDSV1kMxS/ixAOiJpdamfqab2libPIzCzQimVCIZJ+hNg6pbWJhjEegSTSTqfL5b0QWAOcGpEvNP3Q5JmADMAJk6sXo27zhHDePKVups0bWbWr1Kjhr4A/AEwmmRtgr7bYB4XNZG0NM6PiL2Bd0hGJm0kImZGRE9E9IwbN24Qt9s2O41u5cU3V7gCqZkVRqn1CO4B7pE0OyIuLOM9FwILI+L+9P3VbCER5GVCZxur1q7ntWWr2K69Je9wzMwqrtSoocMi4nbgzXI+GoqIlyUtkLRbRDwJHA48PpBrVcLEzqTfesGby50IzKwQSvURHEyyPOXHt3AuGNxSlV8GrkhHDM0HTh7EtcpqQmdSdXTBGyvY9z05B2NmVgWlHg2dlb6W/Zd0RDwM9JT7uuUwfkzSInjhjeU5R2JmVh1Zyl
CfKqldiQskPSTpw9UILg8tzY2MGzWcBU4EZlYQWWoNfS4ilgIfBrqAk4B/rmhUOZswppUFbzoRmFkxZEkESl8/Clw
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# Outlier count as a function of epsilon; the red dashed line marks a target of 3 outliers\n",
"sns.lineplot(x=np.linspace(0.001,10,100),y=number_of_outliers)\n",
"plt.ylabel(\"Number of Points Classified as Outliers\")\n",
"plt.xlabel(\"Epsilon Value\")\n",
"plt.ylim(0,10)\n",
"plt.xlim(0,6)\n",
"plt.hlines(y=3,xmin=0,xmax=10,colors='red',ls='--')"
]
},
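{
"cell_type": "markdown",
"metadata": {},
"source": [
"Instead of reading the crossing point off the plot, the same search can be sketched in code. This is a minimal sketch rather than the only way to do it: `target_outliers` and `chosen_eps` are names made up for illustration, it assumes `two_blobs_outliers` is loaded as above, and it reuses the same epsilon grid as the plot. With `min_samples` fixed, the number of outliers cannot go up as epsilon grows, so the first grid value that meets the target is the smallest one on the grid."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.cluster import DBSCAN\n",
"\n",
"# Hypothetical target, matching the red dashed line in the plot\n",
"target_outliers = 3\n",
"\n",
"chosen_eps = None\n",
"for eps in np.linspace(0.001,10,100):\n",
"    labels = DBSCAN(eps=eps).fit(two_blobs_outliers).labels_\n",
"\n",
"    # DBSCAN uses the label -1 for noise points\n",
"    if np.sum(labels == -1) <= target_outliers:\n",
"        chosen_eps = eps\n",
"        break\n",
"\n",
"# Smallest epsilon on the grid with at most target_outliers outliers\n",
"print(chosen_eps)"
]
},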
{
"cell_type": "code",
"execution_count": 204,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAABV0klEQVR4nO3dd5hU1f348feZXnZmtvdeYXfpS0epCtgQe+8ajekmMYn5Jj9TNTExxZhEI9GoscXeKyBFet9dtrO99zb9/v6YZWBdRBCWYZnzeh6eh7l7585nBvZ85p7yOUJRFCRJkqTgowp0AJIkSVJgyAQgSZIUpGQCkCRJClIyAUiSJAUpmQAkSZKClCbQARyPyMhIJTU1NdBhSJIkjSnbt29vUxQl6vPHx1QCSE1NZdu2bYEOQ5IkaUwRQlQf6bjsApIkSQpSMgFIkiQFKZkAJEmSgtSYGgM4EpfLRV1dHXa7PdChHDODwUBiYiJarTbQoUiSFMTGfAKoq6vDYrGQmpqKECLQ4XwpRVFob2+nrq6OtLS0QIcjSVIQG/MJwG63j5nGH0AIQUREBK2trYEORZKko+joc7C/oYf2PgcJYSbGxVsx6cd8kznMGfFuxkrjf9BYi1eSzmQdfQ7sLg+RFj06jRqAnkEXf3m/hPf2NPrP++7ycVw+IxmV6sz5/T0jEoAkSdLxaO2xU9Pej93l4b3djXxc2MQ5E+K4dUEGyRFmqlr6hjX+AI9+WMrsrEiSI8wBivrkk7OATqL9+/cze/Zs9Ho9Dz30UKDDkSTpCMqberlj1WbufnIb9zy7E5fHy+K8WN7f08i/VlfgcHvoc7hHPM/h9jLo8AQg4tEjE8BJFB4ezl/+8he+//3vBzoUSZKOwOn28OS6Cho7D80aXFPcQlp0CELA9qp2OnodpEaaMenVw547Ls5KXKjhVIc8qoIuAfS/8ipNM2ZRn5hM04xZ9L/y6km7dnR0NNOnT5fTOyXpNNU76GZ7VeeI4139Tq6YmczSiXHc+/wuntlQxUNXT2VicihqlWBeThT/tzIfq0kXgKhHT1CNAfS/8irdP7wXZXAQAE99Pd0/vBcA8yUrAxmaJEmngNWoZUZGBO9/rn8/PTqEzRXtrC5qBqC0qZc1xS3845YZuNweBp0eQgxn3he7oEoAvQ886G/8D1IGB+l94EGZACTpDNbaY6eovpum7kGW5MWgFvDO7kaEgMumJ5MQZmRtcfOw52jUgpKGbn7/djG9djfhZh2/uGwiBekRI65f197PjupOypt7mZ4WwcTkUGxj4G4hqBKAp6HhuI4fi7/97W88/vjjALzzzjvEx8d/5WtJknR8FEVhf2MPRXXd6DQq8hJDSY8OGXZOV7+TB94sYkPpobU3X1+SxVN3zqSjz0VVSx/N3XaunZvKfzdW4/EqAFwwJYFfv16Iw+0FoKPfyX0v7ebha6eSHh2CQedrPgvrulhf0orT4yUiRM8DbxZy+cxkbpiXftpPGQ2qBKCOj8dTX3/E41/V3Xffzd13330iYUmS9CVKG3v4cF8T1W19LJ0YT0FaODaTjr21Xdz95FZcHl+jbTVqefSm6WTGWvzPrWzpG9b4AzyxpoKcOCvffWaH/1hShInLZybzvy01nDcpnrRIM3mJoRSkh+Nye9GoVby7u4HPytvYWN7GBZMS6Bp08s2ntjHg9M0O0mlU3LU4i398XMaS/FgSw0/vKaNBlQAsP7p32BgAgDAasfzo3pNy/aamJgoKCujp6UGlUvGnP/2JoqIirFbrSbm+JAWjA619fOOpbfQMugD4dH8r31s+jkumJ/HsxgP+xh98C7g2lrUOSwADziNP6Sxt7Bl2rLZ9gIlJNtKixrOxtI3ylj4unpZAU5cdtxAI4K4lWdS29fOPT8pJjjBR2tjrb/wBnG4vu2s6SYk04/EeunZD5wD1nYNYDBpSI83+u4dAOz2iOEUO9vP3PvAgnoYG1PHxWH5070nr/4
+NjaWuru6kXEuSJJ/Spl5/43/QE2srOGtcNM3dI4tAtvY4hj1OiTQTYtDQZz+UCCYkhRJjMw47Ly7USGVLP/9aU+E7sB8iLXrOnxzPU+uqAEiNNHPL/AwAtlS0Y3d5+byeQRdL8mL9U0b31nZyz7M7/e/hurmpnD0uGo9XITXKTJhZfzwfx0kVdNNAzZesJHbLJhLqaojdskkO/krSGKQooFYJLpuRPOJnZ40bvvNhUoSZn12cz6TkUCwGDQvHxzA7K5ItlW2kRh7qolmUF8OzGw8Me25brwOt+lAzeaCtn64BJyoBTpeX/ETbiNdfkh/LORNj0WnU9Aw4efDN4mEJ7JkNB9he1cFd/97KD/67k9r2/q/6MZywoLoDkCTp9Ndnd1He3EtLt53YUCPpUSEjvsHfPD+daKuBudmR3HdRHvvquhhwulmQG8uEpFC6B5y09zoIMWqJthpYXdyMUafh3Ilx7KvtZnVxMzlxFq6dm8LT6w9g1muYlBzG/zbXfGl8XQNObEYtOfFW1hS38LVFmby3pxGPV+GGeWkszI3BYvRNGe0ZdFPe3DviGs6hgeV9dd2sK2nhmjkjKwN39DmG7n6cJEeYyYyxoFGf3O/sMgFIknTacLg8PPdZNU8c7IYB7lmew19vKOC9PY1UtfZxwZQEZgxNxXR5FPocbnbXdpESYSbKoqeuY4BfvLqPsqZewkN03Lcij9wEG394Z/+w15qXE41AsCgvhoxoCzur2jl/SgKvbK31n2M1avEoyrDnpUaG8JMV+fz1gxJq2geo6xjg+rmpRITomZUZSYjx0HqBULOWvEQrhXXDxxt0mkMN+dbK9hEJoKPPwQNvFvLpft/gtUrAA1dO5uzxMV/lY/1CMgFIknTaqG7rZ9XaimHH/vR+KU/fNZvvLh837LjHq/Diphqe2eDrnz/Q2s/minbuWJRJWZPvW3dHn5PHV5fzveXjuHhaIm/urMerKCzOjUUAv3q90H+93145iZ5BN7ctyGBLRTvp0SEszotl3f4WNGpBqEnHFTOTeXbjAZZOjGNWZiSXTDcCghc2V6MCBp1uMmIshBi0xIUaCTFouWNhFr95o5Dmbru/26q0sYcoq57WHgfzsqNHfA5lTb3+xh/Aq8Dv3y4mN9FGpOXklaMIWAIQQiQB/wFiAAV4TFGUPwcqHkmSAq9n0MXnvnDj8Sr0Do6cydPSY+fFzdXDjtldnmFdRVFWPfOyo/nPugPoNYIb5qUhhCAzJoRnNx7ga4sycbq9pESaeWtnPRtK2zDp1YyPt9HYPUi/w82srAgmp4bh8SqUNfXg8Xpp6Bzk5a013LN8HE9vOMCl05OwmbQUNvTwwFvFAFw6PYnzJsXz9Poq5mVHEWMzMC7eSveAi88GXczJiiI/0UZyhImypl5SI81oh+4MugedI95va6+DQefJLUYXyDsAN3CPoig7hBAWYLsQ4kNFUYoCGJMkSQEUH2bEatQOGzSNtOiJCzOOOFetEhi0an9/+kGawxZfnTshjv9+dgAUuGp2qv/u4mcr88lLtPHPT8rJibMyOSWUnDgb2bFW1pW0sL2qA4AVUxP55WtFdPb7GuSkCBO3Lsjg/lf2khBmYk9tN1fNTmHVmgounZHMa9sOzQJ8cXMNFoOWovputlV1IAR8e2kOf3m/hKG1Zryzq56vLc7i7x+V8cMLxnP+5AR67W7SokIIMajpsx9q8OdmR57Ub/8QwFlAiqI0KoqyY+jvvUAxkBCoeE7Ue++9R05ODpmZmTzwwAOBDkeSxqT4MBMPXTOFtCjf7JysWAsPXjWZaOvIhi/aauBrizKHHYu1GUiJNHFwzyWdWoVZr+HigkSibXruWpxFSqSZUJOOl4YGfG+en85/N1azam0FT2+oIj8plLPH+bplHG4Phy/mrW0foKShB4tBS5/dRXashW2VHSRHmims6xoR466aTjJjLGTGWPje8vE4XF5uXZDJimmJgG8M40BrP1EWPRtKW3llWy
03//MzfvLibr61dJx/RtPc7Ei+eW4ORp16xGuciNNiDEAIkQpMATYf4Wd3AHcAJCePnPJ1OvB4PNx99918+OGHJCY
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# Try an epsilon read off the plot above, aiming for about 3 outliers\n",
"dbscan = DBSCAN(eps=0.75)\n",
"display_categories(dbscan,two_blobs_outliers)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Minimum Samples\n",
"\n",
"From the scikit-learn `DBSCAN` docstring:\n",
"\n",
"    min_samples : int, default=5\n",
"        The number of samples (or total weight) in a neighborhood for a point\n",
"        to be considered as a core point. This includes the point itself.\n",
"\n",
"How do we choose the minimum number of points? This discussion is a good starting point:\n",
"\n",
"https://stats.stackexchange.com/questions/88872/a-routine-to-choose-eps-and-minpts-for-dbscan"
]
},
{
"cell_type": "code",
"execution_count": 218,
"metadata": {},
"outputs": [],
"source": [
"outlier_percent = []\n",
"\n",
"# Sweep min_samples while leaving eps at its default\n",
"for n in np.arange(1,100):\n",
"\n",
"    # Create the model with the current min_samples\n",
"    dbscan = DBSCAN(min_samples=n)\n",
"    dbscan.fit(two_blobs_outliers)\n",
"\n",
"    # Log the percentage of points labeled as outliers (label -1)\n",
"    perc_outliers = 100 * np.sum(dbscan.labels_ == -1) / len(dbscan.labels_)\n",
"\n",
"    outlier_percent.append(perc_outliers)"
]
},
{
"cell_type": "code",
"execution_count": 226,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Text(0.5, 0, 'Minimum Number of Samples')"
]
},
"execution_count": 226,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAX4AAAEHCAYAAACp9y31AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAArrElEQVR4nO3deXwV9dXH8c9h30EgIKvsIKCoREXBpYrWrS6tu1atVq1Lq9a6tfapWu3TqrVVn2pLqxaLtSqgVbTWDaUqimGXTdkXgYQ9LCHbef6YoYZAbobkTm5y7/f9euV178ydO78zGV6HyW9+c37m7oiISOaol+oARESkZinxi4hkGCV+EZEMo8QvIpJhlPhFRDKMEr+ISIZpENeOzexp4Awg190Hh+vaAi8APYClwPnuvrGyfbVv39579OgRV6giImlp6tSp69w9q/x6i2scv5kdC2wFni2T+B8ENrj7r83sTmA/d7+jsn1lZ2d7Tk5OLHGKiKQrM5vq7tnl18fW1ePuk4AN5VafBYwO348Gzo6rfRER2bua7uPv6O6rw/drgI413L6ISMZL2c1dD/qYKuxnMrNrzCzHzHLy8vJqMDIRkfRW04l/rZl1Aghfcyva0N1HuXu2u2dnZe1xb0JERKqo0sRvZueZWcvw/d1mNt7MDqtie68Cl4fvLwf+WcX9iIhIFUW54v+5u+eb2QhgJPAU8GRlXzKz54HJQH8zW2lmVwG/Bk4ysy/Dff266qGLiEhVRBnHXxK+ng6McvfXzez+yr7k7hdV8NGJUYMTEZHki3LFv8rM/gRcALxhZo0jfk9ERKpo/dad3PfaXLYXFid931ES+PnAv4FvuvsmoC1wW9IjERERAEpKnZtfmMGYT5exfMP2pO8/YVePmdUHprn7gF3rwnH4qyv+loiIVMdj737Jf75cx6+/fRAD9m+V9P0nvOJ39xJggZl1T3rLIiKyh0lf5PHYe1/y7cO6cMHh3WJpI8rN3f2AOWY2Bdi2a6W7nxlLRCIiGWrN5gJufmEGfTu04P6zB2NmsbQTJfH/PJaWRURkN3/8YBFbC4p58dphNGsUW/HkyhO/u39gZgcAfd39HTNrBtSPLSIRkQxUUFTCy9NX8c3B+9OnQ8tY24ry5O7VwFjgT+GqLsArMcYkIpJx3pm3ls07ijhvaNfY24oynPMGYDiwBcDdvwQ6xBmUiEimeSlnJZ1bN2F4n/axtxUl8e9098JdC2bWgARVNUVEZN+s3ryDSV/mce7QrtSvF88N3bKiJP4PzOynQFMzOwl4CXgt3rBERDLH+GmrcIdzh8YzfLO8KIn/TiAPmA1cC7wB3B1nUCIimcLdeSlnBcN6taV7u2Y10maUUT2lwJ/DHxERSaLPlm5k6frt/PCEvjXWZoWJ38xedPfzzWw2e+nTd/eDY41MRCTNvfn5Gu5+5XNaNWnAqQftX2PtJrrivyl8PaMmAhERyRQbthXy839+zuuzVjOocyseOndIrA9slVdhS7smRXf3ZTUWjYhIBvjp+Nm8Nz+Xn5zcj2uP603D+jVb6T5RV08+QRePsXtXjxHMlZ78knEiImluR2EJExfkcvGR3bmxBvv1y0p0xR/vM8MiIhnow4Xr2FlcysgDO6YshiglG/4WZZ2IiFTunblradm4AUf0bJuyGKJ0LA0quxA+uTs0nnBERNJXaanz7vy1HNc/i0YNUjeDbYUtm9ldYT//wWa2JfzJB9YC/6yxCEVE0sSMlZtYt7WQkwamrpsHEiR+d//fsJ//IXdvFf60dPd27n5XDcYoIpIW3pm7lvr1jOP7pbbOZZSBo/8ys2PLr3T3STHEIyKStt6Zt5YjerSldbOGKY0jSuK/rcz7JsARwFTghFgiEhFJQ8vWb+OLtVv5+Rmpn8I8Sq2eb5VdNrNuwO/jCkhEJB29My8XgJEHpn46k6rcVl4JHJjsQERE0tXO4hJeyllB3w4tOKBd81SHU/kVv5k9zt
dP7tYDDgGmxRiTiEha+dXr85i/Jp8/Xlo7RsJH6ePPKfO+GHje3T+KKR4RkbTy2syvGD15GVeN6Mkpg2uuAmciURL/C0Cf8P1Cdy+IMR4RkbSxKG8rd46bxWHd23DnqQNSHc5/JXqAq4GZPUjQpz8aeBZYYWYPmllqxyKJiNRyOwpLuH7MNBo1qMf/XXxYjVfgTCRRJA8BbYGe7j7U3Q8DegNtgIdrIDYRkTrJ3bn7lc/5Ijef3194KJ3bNE11SLtJlPjPAK529/xdK9x9C3AdcFrcgYmI1FUv5qxg3LSV/PAbfTiuX1aqw9lDosTv7r63KRdL2MtUjPvCzG4xszlm9rmZPW9mTaqzPxGR2mLuV1v4n3/OYXifdtw0sl+qw9mrRIl/rpldVn6lmV0KzK9qg2bWBfgRkO3ug4H6wIVV3Z+ISG2xpaCI65+bSptmDXn0wkOpX89SHdJeJRrVcwMw3syuJCjRAJANNAXOSUK7Tc2sCGgGfFXN/YmIpJS7c8fYWazYuIPnrx5G+xaNUx1ShRLNwLUKONLMTuDrmvxvuPu71WnQ3VeZ2cPAcmAH8Ja7v1V+OzO7BrgGoHv31Ne2EBFJ5JmPlvKvz9dw16kDUjrJShSVji9y9/fc/fHwp1pJH8DM9gPOAnoCnYHmYfdR+XZHuXu2u2dnZdW+myMiIrtMW76RX70xj5EHduSaY3ulOpxKpWJg6UhgibvnuXsRMB44OgVxiIhU27adxdz43DT2b92E3543BLPa2a9fVioS/3JgmJk1s+A3dCIwLwVxiIhU2+jJS/lqcwG/v+CQlNfZjyrKZOvNzaxe+L6fmZ1ZnSd33f1TYCxBobfZYQyjqro/EZFUyS8oYtSkxXyjfxbZPWp3v35ZUa74JwFNwmGYbwHfBf5anUbd/RfuPsDdB7v7d919Z3X2JyKSCn/9aCmbthdxy0m1c7x+RaIkfnP37cC3gSfc/Ty+HuUjIpKRNu8o4s//WczIAztycNc2qQ5nn0RK/GZ2FHAJ8Hq4rn58IYmI1H5Pf7iELQXF3Dyyb6pD2WdRyjLfBNwFvOzuc8ysFzAx3rBERGqXVZt2cO+rcygsKQVgypINnDJofwZ3aZ3iyPZdlDl3JxH08+9aXkxQckFEJGM88tYXvL8gjwM7tQRgUOdW3HZK/xRHVTVRpl7MAm4n6Nf/bzE1dz8hxrhERGqNxXlbeXn6Sq4c3pO7zxiY6nCqLUof/3MERdl6AvcCS4HPYoxJRKRWefy9hTRuUJ9rj+ud6lCSIkrib+fuTwFF7v6Bu18J6GpfRDLCwtx8/jljFZcdfQBZLWtv4bV9EeXmblH4utrMTieopFl3nlQQEamGR99dSNOG9bn22PS42odoif9+M2sN3Ao8DrQCbok1KhGRFPl44Tpen72aUoeS0lImzPqK64/vTdvmjVIdWtJEGdUzIXy7GfhGvOGIiKTGloIifvX6PP7x2QpaNm5Ak0bB40r9O7bk+yNqf8XNfRHlil9EJK1NXbaRG56bRm5+Adce14tbRvajScP0fU5ViV9EMt59E+ZSz+Dl64czpFubVIcTu1SUZRYRqTW+WJvPzBWbuHJEz4xI+pDgit/Mfpzoi+7+SPLDERGpWS/lrKBBPeOcQ7ukOpQak6irp2X42h84HHg1XP4WMCXOoEREakJRSSkvT1/FiQd2oF0tnhw92RJNtn4vgJlNAg5z9/xw+R6+rtIpIlJnTZyfy7qthZw3tFuqQ6lRUfr4OwKFZZYLw3UiInXaizkryWrZmOP7Z6U6lBoVZVTPs8AUM3s5XD4bGB1bRCIiNSA3v4CJC3L5/jE9aVA/s8a5RHmA6wEz+xdwTLjqe+4+Pd6wRETiNW7qKkpKPeO6eSD6OP5mwBZ3f8bMssysp7sviTMwEZE47Cgs4eG3FvD0R0sY1qstfTq0SHVINS5KPf5fANkEo3ueARoCY4Dh8YYmIpJcnyxezx3jZrFs/XYuHdadO04ZkOqQUiLKFf
85wKHANAB3/8rMWib+iohI7bF1ZzG//tc8xnyynO5tm/H81cM4qne7VIeVMlESf6G7u5k5gJk1jzkmEZGk+XjhOm4
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.lineplot(x=np.arange(1,100),y=outlier_percent)\n",
"plt.ylabel(\"Percentage of Points Classified as Outliers\")\n",
"plt.xlabel(\"Minimum Number of Samples\")"
]
},
{
"cell_type": "code",
"execution_count": 229,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAABV20lEQVR4nO3dd3iT19n48e/RlmxJ3ntPsM02m4SZABkQsvdO2jTdK23ztu+v803btGmbNG0zaGazmr0HAcLe0zZeGO+9h7ae3x8yAseEQMAIo/O5rlxX9OjRo1sCzq3njPsIRVGQJEmSgo8q0AFIkiRJgSETgCRJUpCSCUCSJClIyQQgSZIUpGQCkCRJClKaQAdwMqKiopS0tLRAhyFJkjSq7Nixo01RlOjPHx9VCSAtLY3t27cHOgxJkqRRRQhRfazjsgtIkiQpSMkEIEmSFKRkApAkSQpSo2oM4FhcLhd1dXXY7fZAh3LCDAYDSUlJaLXaQIciSVIQG/UJoK6uDrPZTFpaGkKIQIfzpRRFob29nbq6OtLT0wMdjiRJQWzUJwC73T5qGn8AIQSRkZG0trYGOhRJko6jo8/BgYYe2vscJIabGJNgwaQf9U3mEOfEpxktjf9hoy1eSTqXdfQ5sLs8RJn16DRqAHpsLv72YSkf7G30n/e9pWO4aloKKtW58+/3nEgAkiRJJ6O1x05Nez92l4cP9jSyqqiJC8bFc8e8TFIiQ6hq6RvS+AM8+nEZM7OjSIkMCVDUp5+cBXQaHThwgJkzZ6LX63nwwQcDHY4kScdQ0dTL3Su3cO9T2/nB87twebwszI/jw72NPLG6EofbQ5/DPex1DrcXm8MTgIhHjkwAp1FERAR/+9vf+OEPfxjoUCRJOgan28NT6ypp7Dwya3BNSQvpMaEIATuq2unodZAWFYJJrx7y2jHxFuLDDGc65BEVdAmg/7XXaZo2g/qkFJqmzaD/tddP27VjYmKYOnWqnN4pSWepXpubHVWdw4539Tu5enoKi8fHc9+Lu3luQxUPXjeZ8SlhqFWCObnR/HxFARaTLgBRj5ygGgPof+11un98H4rNBoCnvp7uH98HQMjlKwIZmiRJZ4DFqGVaZiQffq5/PyMmlC2V7awubgagrKmXNSUt/PP2abjcHmxOD6GGc++HXVAlgN4Hfu9v/A9TbDZ6H/i9TACSdA5r7bFTXN9NU7eNRfmxqAW8t6cRIeDKqSkkhhtZW9I85DUataC0oZs/vltCr91NRIiOX105nsKMyGHXr2vvZ2d1JxXNvUxNj2R8ShjWUXC3EFQJwNPQcFLHT8Tf//53Hn/8cQDee+89EhISvvK1JEk6OYqicKCxh+K6bnQaFflJYWTEhA45p6vfyQNvF7Oh7Mjam28syubpr0+no89FVUsfzd12bpidxn82VuPxKgBcMimR375ZhMPtBaCj38n9r+zhoRsmkxETikHnaz6L6rpYX9qK0+MlMlTPA28XcdX0FG6ek3HWTxkNqgSgTkjAU19/zONf1b333su99957KmFJkvQlyhp7+Hh/E9VtfSwen0BhegRWk459tV3c+9Q2XB5fo20xann01qlkxZn9rz3Y0jek8Qd4ck0lufEWvvfcTv+x5EgTV01P4b9ba7hoQgLpUSHkJ4VRmBGBy+1Fo1bx/p4GNlW0sbGijUsmJNJlc/Ktp7cz4PTNDtJpVNyzMJt/ripnUUEcSRFn95TRoEoA5p/cN2QMAEAYjZh/ct9puX5TUxOFhYX09PSgUqn4y1/+QnFxMRaL5bRcX5KC0aHWPr759HZ6bC4APjvQyveXjuHyqck8v/GQv/EH3wKujeWtQxLAgPPYUzrLGnuGHKttH2B8spX06LFsLGujoqWPy6Yk0tRlxy0EArhnUTa1bf3889MKUiJNlDX2+ht/AKfby56aTlKjQvB4j1y7oXOA+k4bZoOGtKgQ/91DoJ0dUZwhh/v5ex/4PZ6GBtQJCZh/ct9p6/
+Pi4ujrq7utFxLkiSfsqZef+N/2JNrKzlvTAzN3cOLQLb2OIY8To0KIdSgoc9+JBGMSw4j1moccl58mJGDLf08sabSd+AARJn1XDwxgafXVQGQFhXC7XMzAdha2Y7d5eXzemwuFuXH+aeM7qvt5AfP7/J/hhtnp3H+mBg8XoW06BDCQ/Qn83WcVkE3DTTk8hXEbd1MYl0NcVs3y8FfSRqFFAXUKsGV01KGPXfemKE7HyZHhvCLywqYkBKG2aBh/thYZmZHsfVgG2lRR7poFuTH8vzGQ0Ne29brQKs+0kweauuna8CJSoDT5aUgyTrs/RcVxHHB+Dh0GjU9A05+/3bJkAT23IZD7Kjq4J5/b+NH/9lFbXv/V/0aTllQ3QFIknT267O7qGjupaXbTlyYkYzo0GG/4G+bm0GMxcDsnCjuX5bP/rouBpxu5uXFMS45jO4BJ+29DkKNWmIsBlaXNGPUabhwfDz7a7tZXdJMbryZG2an8uz6Q4ToNUxICee/W2q+NL6uASdWo5bcBAtrSlr42oIsPtjbiMercPOcdObnxWI2+qaM9tjcVDT3DruGc3BgeX9dN+tKW7h+1vDKwB19jsG7HycpkSFkxZrRqE/vb3aZACRJOms4XB5e2FTNk4e7YYAfLM3l4ZsL+WBvI1WtfVwyKZFpg1MxXR6FPoebPbVdpEaGEG3WU9cxwK9e3095Uy8RoTruX55PXqKVP713YMh7zcmNQSBYkB9LZoyZXVXtXDwpkde21frPsRi1eBRlyOvSokL52fICHv6olJr2Aeo6BrhpdhqRoXpmZEURajyyXiAsREt+koWiuqHjDTrNkYZ828H2YQmgo8/BA28X8dkB3+C1SsAD10zk/LGxX+Vr/UIyAUiSdNaobutn5drKIcf+8mEZz94zk+8tHTPkuMer8PLmGp7b4OufP9Taz5bKdu5ekEV5k+9Xd0efk8dXV/D9pWO4bEoSb++qx6soLMyLQwC/ebPIf73/u2YCPTY3d87LZGtlOxkxoSzMj2PdgRY0akGYScfV01N4fuMhFo+PZ0ZWFJdPNQKCl7ZUowJsTjeZsWZCDVriw4yEGrTcPT+b371VRHO33d9tVdbYQ7RFT2uPgzk5McO+h/KmXn/jD+BV4I/vlpCXZCXKfPrKUQQsAQghkoFngFhAAR5TFOWvgYpHkqTA67G5+NwPbjxehV7b8Jk8LT12Xt5SPeSY3eUZ0lUUbdEzJyeGZ9YdQq8R3DwnHSEEWbGhPL/xEF9bkIXT7SU1KoR3dtWzoawNk17N2AQrjd02+h1uZmRHMjEtHI9XobypB4/XS0OnjVe31fCDpWN4dsMhrpiajNWkpaihhwfeKQHgiqnJXDQhgWfXVzEnJ5pYq4ExCRa6B1xssrmYlR1NQZKVlEgT5U29pEWFoB28M+i2OYd93tZeBzbn6S1GF8g7ADfwA0VRdgohzMAOIcTHiqIUBzAmSZICKCHciMWoHTJoGmXWEx9uHHauWiUwaNX+/vTDNEctvrpwXDz/2XQIFLh2Zpr/7uIXKwrIT7Lyr08ryI23MDE1jNx4KzlxFtaVtrCjqgOA5ZOT+PUbxXT2+xrk5EgTd8zL5Jev7SMx3MTe2m6unZnKyjWVXDEthTe2H5kF+PKWGswGLcX13Wyv6kAI+M7iXP72YSmDa814b3c9X1uYzT8+KefHl4zl4omJ9NrdpEeHEmpQ02c/0uDPzok6rb/+IYCzgBRFaVQUZefg//cCJUBioOI5VR988AG5ublkZWXxwAMPBDocSRqVEsJNPHj9JNKjfbNzsuPM/P7aicRYhjd8MRYDX1uQNeRYnNVAapSJw3su6dQqQvQaLitMIsaq556F2aRGhRBm0vHK4IDvbXMz+M/GalaureTZDVUUJIdx/hhft4zD7eHoxby17QOUNvRgNmjps7vIiTOz/WAHKVEhFNV1DYtxd00nWbFmsmLNfH/pWBwuL3fMy2L5lCTAN4ZxqLWfaLOeDWWtvL
a9ltv+tYmfvbyHby8e45/RNDsnim9dmItRpx72HqfirBgDEEKkAZOALcd47m7gboCUlOFTvs4GHo+He++9l48//pi
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# A common rule of thumb: set min_samples to twice the number of dimensions\n",
"num_dim = two_blobs_outliers.shape[1]\n",
"\n",
"dbscan = DBSCAN(min_samples=2*num_dim)\n",
"display_categories(dbscan,two_blobs_outliers)"
]
},
{
"cell_type": "code",
"execution_count": 230,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAABV0klEQVR4nO3dd5hU1f348feZXnZmtvdeYXfpS0epCtgQe+8ajekmMYn5Jj9TNTExxZhEI9GoscXeKyBFet9dtrO99zb9/v6YZWBdRBCWYZnzeh6eh7l7585nBvZ85p7yOUJRFCRJkqTgowp0AJIkSVJgyAQgSZIUpGQCkCRJClIyAUiSJAUpmQAkSZKClCbQARyPyMhIJTU1NdBhSJIkjSnbt29vUxQl6vPHx1QCSE1NZdu2bYEOQ5IkaUwRQlQf6bjsApIkSQpSMgFIkiQFKZkAJEmSgtSYGgM4EpfLRV1dHXa7PdChHDODwUBiYiJarTbQoUiSFMTGfAKoq6vDYrGQmpqKECLQ4XwpRVFob2+nrq6OtLS0QIcjSVIQG/MJwG63j5nGH0AIQUREBK2trYEORZKko+joc7C/oYf2PgcJYSbGxVsx6cd8kznMGfFuxkrjf9BYi1eSzmQdfQ7sLg+RFj06jRqAnkEXf3m/hPf2NPrP++7ycVw+IxmV6sz5/T0jEoAkSdLxaO2xU9Pej93l4b3djXxc2MQ5E+K4dUEGyRFmqlr6hjX+AI9+WMrsrEiSI8wBivrkk7OATqL9+/cze/Zs9Ho9Dz30UKDDkSTpCMqberlj1WbufnIb9zy7E5fHy+K8WN7f08i/VlfgcHvoc7hHPM/h9jLo8AQg4tEjE8BJFB4ezl/+8he+//3vBzoUSZKOwOn28OS6Cho7D80aXFPcQlp0CELA9qp2OnodpEaaMenVw547Ls5KXKjhVIc8qoIuAfS/8ipNM2ZRn5hM04xZ9L/y6km7dnR0NNOnT5fTOyXpNNU76GZ7VeeI4139Tq6YmczSiXHc+/wuntlQxUNXT2VicihqlWBeThT/tzIfq0kXgKhHT1CNAfS/8irdP7wXZXAQAE99Pd0/vBcA8yUrAxmaJEmngNWoZUZGBO9/rn8/PTqEzRXtrC5qBqC0qZc1xS3845YZuNweBp0eQgxn3he7oEoAvQ886G/8D1IGB+l94EGZACTpDNbaY6eovpum7kGW5MWgFvDO7kaEgMumJ5MQZmRtcfOw52jUgpKGbn7/djG9djfhZh2/uGwiBekRI65f197PjupOypt7mZ4WwcTkUGxj4G4hqBKAp6HhuI4fi7/97W88/vjjALzzzjvEx8d/5WtJknR8FEVhf2MPRXXd6DQq8hJDSY8OGXZOV7+TB94sYkPpobU3X1+SxVN3zqSjz0VVSx/N3XaunZvKfzdW4/EqAFwwJYFfv16Iw+0FoKPfyX0v7ebha6eSHh2CQedrPgvrulhf0orT4yUiRM8DbxZy+cxkbpiXftpPGQ2qBKCOj8dTX3/E41/V3Xffzd13330iYUmS9CVKG3v4cF8T1W19LJ0YT0FaODaTjr21Xdz95FZcHl+jbTVqefSm6WTGWvzPrWzpG9b4AzyxpoKcOCvffWaH/1hShInLZybzvy01nDcpnrRIM3mJoRSkh+Nye9GoVby7u4HPytvYWN7GBZMS6Bp08s2ntjHg9M0O0mlU3LU4i398XMaS/FgSw0/vKaNBlQAsP7p32BgAgDAasfzo3pNy/aamJgoKCujp6UGlUvGnP/2JoqIirFbrSbm+JAWjA619fOOpbfQMugD4dH8r31s+jkumJ/HsxgP+xh98C7g2lrUOSwADziNP6Sxt7Bl2rLZ9gIlJNtKixrOxtI3ylj4unpZAU5cdtxAI4K4lWdS29fOPT8pJjjBR2tjrb/wBnG4vu2s6SYk04/EeunZD5wD1nYNYDBpSI83+u4dAOz2iOEUO9vP3PvAgnoYG1PHxWH5070nr/4
+NjaWuru6kXEuSJJ/Spl5/43/QE2srOGtcNM3dI4tAtvY4hj1OiTQTYtDQZz+UCCYkhRJjMw47Ly7USGVLP/9aU+E7sB8iLXrOnxzPU+uqAEiNNHPL/AwAtlS0Y3d5+byeQRdL8mL9U0b31nZyz7M7/e/hurmpnD0uGo9XITXKTJhZfzwfx0kVdNNAzZesJHbLJhLqaojdskkO/krSGKQooFYJLpuRPOJnZ40bvvNhUoSZn12cz6TkUCwGDQvHxzA7K5ItlW2kRh7qolmUF8OzGw8Me25brwOt+lAzeaCtn64BJyoBTpeX/ETbiNdfkh/LORNj0WnU9Aw4efDN4mEJ7JkNB9he1cFd/97KD/67k9r2/q/6MZywoLoDkCTp9Ndnd1He3EtLt53YUCPpUSEjvsHfPD+daKuBudmR3HdRHvvquhhwulmQG8uEpFC6B5y09zoIMWqJthpYXdyMUafh3Ilx7KvtZnVxMzlxFq6dm8LT6w9g1muYlBzG/zbXfGl8XQNObEYtOfFW1hS38LVFmby3pxGPV+GGeWkszI3BYvRNGe0ZdFPe3DviGs6hgeV9dd2sK2nhmjkjKwN39DmG7n6cJEeYyYyxoFGf3O/sMgFIknTacLg8PPdZNU8c7IYB7lmew19vKOC9PY1UtfZxwZQEZgxNxXR5FPocbnbXdpESYSbKoqeuY4BfvLqPsqZewkN03Lcij9wEG394Z/+w15qXE41AsCgvhoxoCzur2jl/SgKvbK31n2M1avEoyrDnpUaG8JMV+fz1gxJq2geo6xjg+rmpRITomZUZSYjx0HqBULOWvEQrhXXDxxt0mkMN+dbK9hEJoKPPwQNvFvLpft/gtUrAA1dO5uzxMV/lY/1CMgFIknTaqG7rZ9XaimHH/vR+KU/fNZvvLh837LjHq/Diphqe2eDrnz/Q2s/minbuWJRJWZPvW3dHn5PHV5fzveXjuHhaIm/urMerKCzOjUUAv3q90H+93145iZ5BN7ctyGBLRTvp0SEszotl3f4WNGpBqEnHFTOTeXbjAZZOjGNWZiSXTDcCghc2V6MCBp1uMmIshBi0xIUaCTFouWNhFr95o5Dmbru/26q0sYcoq57WHgfzsqNHfA5lTb3+xh/Aq8Dv3y4mN9FGpOXklaMIWAIQQiQB/wFiAAV4TFGUPwcqHkmSAq9n0MXnvnDj8Sr0Do6cydPSY+fFzdXDjtldnmFdRVFWPfOyo/nPugPoNYIb5qUhhCAzJoRnNx7ga4sycbq9pESaeWtnPRtK2zDp1YyPt9HYPUi/w82srAgmp4bh8SqUNfXg8Xpp6Bzk5a013LN8HE9vOMCl05OwmbQUNvTwwFvFAFw6PYnzJsXz9Poq5mVHEWMzMC7eSveAi88GXczJiiI/0UZyhImypl5SI81oh+4MugedI95va6+DQefJLUYXyDsAN3CPoig7hBAWYLsQ4kNFUYoCGJMkSQEUH2bEatQOGzSNtOiJCzOOOFetEhi0an9/+kGawxZfnTshjv9+dgAUuGp2qv/u4mcr88lLtPHPT8rJibMyOSWUnDgb2bFW1pW0sL2qA4AVUxP55WtFdPb7GuSkCBO3Lsjg/lf2khBmYk9tN1fNTmHVmgounZHMa9sOzQJ8cXMNFoOWovputlV1IAR8e2kOf3m/hKG1Zryzq56vLc7i7x+V8cMLxnP+5AR67W7SokIIMajpsx9q8OdmR57Ub/8QwFlAiqI0KoqyY+jvvUAxkBCoeE7Ue++9R05ODpmZmTzwwAOBDkeSxqT4MBMPXTOFtCjf7JysWAsPXjWZaOvIhi/aauBrizKHHYu1GUiJNHFwzyWdWoVZr+HigkSibXruWpxFSqSZUJOOl4YGfG+en85/N1azam0FT2+oIj8plLPH+bplHG4Phy/mrW0foKShB4tBS5/dRXashW2VHSRHmims6xoR466aTjJjLGTGWPje8vE4XF5uXZDJimmJgG8M40BrP1EWPRtKW3llWy
03//MzfvLibr61dJx/RtPc7Ei+eW4ORp16xGuciNNiDEAIkQpMATYf4Wd3AHcAJCePnPJ1OvB4PNx99918+OGHJCY
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# Same min_samples choice, now combined with eps=0.75\n",
"num_dim = two_blobs_outliers.shape[1]\n",
"\n",
"dbscan = DBSCAN(eps=0.75,min_samples=2*num_dim)\n",
"display_categories(dbscan,two_blobs_outliers)"
]
},
{
"cell_type": "code",
"execution_count": 231,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAABaOklEQVR4nO3dd3hUVfrA8e+ZPplMeiGNhCT03psIiKjYe++FteyKvazlt6urYtdV1GXtFV17pQgogiIiPaH39J7MZPrM+f0xYSCEnoQB5nyeh0fm5M6ZdwZz3rnnnvseIaVEURRFiTyacAegKIqihIdKAIqiKBFKJQBFUZQIpRKAoihKhFIJQFEUJULpwh3AwUhKSpI5OTnhDkNRFOWo8ueff1ZJKZN3bz+qEkBOTg6LFy8OdxiKoihHFSHE1j21qykgRVGUCKUSgKIoSoRSCUBRFCVCHVXXAPbE6/VSVFSEy+UKdyh7ZTKZyMzMRK/XhzsURVGUkKM+ARQVFWG1WsnJyUEIEe5wWpBSUl1dTVFREZ06dQp3OIqiKCFHfQJwuVxH7OAPIIQgMTGRysrKcIeiKMpB8FdV4V2xAn9FJbrsjuh790YTHR3usNrUUZ8AgCN28N/hSI9PUSKZv6oK6XCgTU1FGI3Btvp66h95FOdnn4eOi33kn1iuuRqhOXYunR4770RRFOUA+cvKcP36G87Zs6l7+P8oHzmK2jvvxrtpMwC+teuaDf4A9Y8/gW/LljBE235UAmgj06dPp2vXruTn5zN58uRwh6Moyl54V6+m8uxzqb7gQmquvBo8HsxnnIHziy9oeO45pNtNwNbQ8okuF7LRcdjjbU8qAbQBv9/PLbfcwg8//EBhYSEfffQRhYWF4Q5LUZTdSLebhn+/hH/79lCb64fp6Lp0BiHwLPgVX0UFus6dEbvN9+v79EaXlXm4Q25XEZcAGj//grIhwyjO7EjZkGE0fv5Fq/tctGgR+fn55ObmYjAYuPjii/nqq6/aIFpFUdqSv74ez4JfW7QHqquxXH8d5nPPofa6G7C/8iqJb72Jfshg0Goxjh9P/PPPo4mLO/xBt6Nj4iLwgWr8/Avq77kX6XQC4C8upv6eewGwnHvOIfdbXFxMVlZW6HFmZia///5764JVFKXNaePiMB5/PM4vmn/x03Xrivvnebi++x4Ab0EBru9/IOnLz5FuN9LhQMRYwxFyu4qoBGCb/GRo8N9BOp3YJj/ZqgSgKMqRzVdainfZcnwlxZjPPAN0Wpz/+xSEIOrqq9F2zMb1w/TmTzLo8S5fQd0DDyLr69EkJRE/5WVMx41s0b93y1bcv/2Kb/UajCNHYhgyGG18/GF6d4cuohKAv6TkoNoPVEZGBtt3mVMsKioiIyOjVX0qirJ/Ukq8K1fiWboMYTRi6N8ffdcuzY7x19RQd9/fcf/4Y6gt5u/3Y5kxHVlRgXf9egIlJUTfdCP21/4Dfj8Alosuovauu6GpykCgqoraG28i4f130XftisZsBsC9ZCmuH38EtxttcjJ1992P5dprsN5y8xG/ZDSiEoA2PR1/cfEe21tj8ODBrF+/ns2bN5ORkcG0adP48MMPW9Wnoig7eQoKcH71Nb4NGzCfcw7G40aijY/Hs3gxVRdeDB4PACIujqT/fYKhR/fQc31r1zYb/AEannuexF69qL7iylCbNrcTlmuvofHtdzCffx66/M4YBvTHOHJksH+9Hsenn+GeMxf3nLmYL7wAWVNL9cWXIBsbg50YjcTedy/1Tz5F1BlnoMvJbv8PpxWO7PTUxqz33Ytoyto7CLMZ6333tqpfnU7Hyy+/zMknn0z37t258MIL6dmzZ6v6VBQlyLt+PVUXXox9yiu4Zsyk9sabcH7xBdLnw/6fqaHBH0DW1eGeM6fZ8wM7BudduVx4ClY1a/Jv2oxh8CDiHvsXsrYW79rVWC69FHQ6EAIExNx/L+h02J59Ds/ixTi/+Wbn4A/gduNetA
hdXh7S7ws1+7ZvxzV/Pp6VKwnsNg0dThF1BrBjnt82+Un8JSVo09Ox3ndvm8z/n3rqqZx66qmt7kdRlOa8BYXIurpmbbbnXsB48sn4i1tO3/rLypo91uXlIWJikA071/YbBg1Em958mlablYV37Trszz4XbJg+A02HVKIuuAD7Sy8H+8rPx3r7JADc834JTQ/tKlBXh/msM9BlBpeMuhf/Sc3VVxOoDb6H6JtvwnTySeDzo+ucjzYx8cA/jDYWUWcAEEwCHRYtJKNoGx0WLVQXfxXliCdbtkiJRqvFcs3VLX5mOml8s8f6Tp2If+E5DIMHI2JjMZ06AePYsXh++QVdfn7oOPPpp9H46mvNnhsoK0fsUsXXt2EDgeoa0GjA7Ubfv3+L1zefdSZRZ52FMBoJ1NVRd//fQ4M/gP2VV3Ev+JWq886n+upr8G7ecoCfQ9uLqDMARVGOfAGbDW9hIf7SUrQZGei6dm3xDd562yS0HTpgGncCcc8+g2fxnwQcjZgnTMAwaBCB2lr85RVoYmPRpnXA+f0PCIuFqLPPxrNkCa7vf0DfqxdRN91I45RX0FitGIYMxv7W2/uPr6YGTXw8+t69cf0wHes9d+P87HOk34/1rzdjOvVUtLGxAPjr6vDt6aZQtxsA75KluGbOQP+Xv7Q4xF9ZibegkEBdLbrcPPTduzVLRm1BJQBFUY4YAZcL+9T/Ynvu+VBbzL8eIenjj3B89jm+9euJuvBCjMePAkB6PAQaGvAsXowuLxdthw74tmyh9rY78BUUoElOJu7ZpzH060v9gw/j3uW1jCeNR6PRYD7jdPRdu+L6dSFRF16A4933Qsdo4uMgEGgWo65zZ+KeeYr6Rx/Dv2kTvq1bsN58MyIpCdMJY9Fad94voElMRN+/P96lS5u/0aaicwDu+Quw7pYA/JWV1N57H+4ZM5s60pDw+n8xn3zSIXyqe6cSgKIoRwzfhg3Ynn+hWVvDPx4hZcZ04v75j2bt0u+n8c23sL/yavC569fj/nke1nvuxldQAECgshLbs88R889/EnX5ZTimfQx+P+YzTkcIQd3td4T6i399KoHaWqx33oH753nounXFfMbpOGfMBL0eTUIC0dddi/21/2A+52xMY0ajvfJyQGB//Q3QaAg4nRi6dUVjjUHXMQut1UrM3XdRd9fdweXmOh2Wq6/CW1CAJq0DgdIyTCeOa/E5eAsLdw7+AIEAdX9/AEP/fmhTUtrks4YwJgAhRBbwLpBKcJJvqpTyxXDFoyhK+AXq60HuNufv8xFoqG9xrL+0FPsbbzZrk04nsn7nsZq0DpjGjaNxyhQwmrDecjMIga5HD+yvvYb1nruD6/fz83FM+xj3j7MR0dHo+/bBX1REwGbHNHYsxqFDwe/HW1CA9Pnwb9tG4zvvEvuvR7BPeRXLVVeiSUjAs3QpDffeB4DlqiuJOv98bFNewTT+RLTp6ej79CFQW4t77lxMY8diGDgQTV4u3sLV6PLzEAZD8HPY5ZpB6LMpKyPQ2Ii2lZ/xrsJ5BuAD7pRSLhFCWIE/hRCzpJSqipqiRChdVhaa+LhmA6AmNRVdVscWxwqdDk2UmYDb3fwHu8yTR511Fvap/wUpib7hemwvBL9jxr34PIb+/bE99TT63r2xDh2KoVcvDD174pw5M1QvKOqSS6i77XYC1dVA8F4B6x23U/e3SWizs/H88SfRE2+g4fkXsFx1JY73Pwi9duMbb6KJicG7bBmeBQtACGL/8TD1/3w0NK3k+PQzYu65m5onJhM3+QmiLjifQENDsBjdbtc9jCeOQ5ua2roPeDdhWwUkpSyVUi5p+rsNWA0clbfPXnvttaSkpNCrV69wh6IoRzVdx44kvPUWus6dg4979iThjf+iTevQ4lhthw5Y7767eVtGBrr8vOC6fUAYjQhrNJbLL0OTkYH1vnvR5eWhTUik8c23ALBOuhX7a//B9sKL2F55FcPAgZhOPhkA6XKBdud3bv+mzfhWrE
TExiIbGtD16olr/nx0ebkt5/kB96JF6Lt3R9ejO7GPPkLA6cJ6x+1EXXpJ8ACPB9/6DWg7dMA1ezb2996ncsJp1N5
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# With min_samples=1 every point is a core point, so nothing is labeled an outlier\n",
"dbscan = DBSCAN(min_samples=1)\n",
"display_categories(dbscan,two_blobs_outliers)"
]
},
{
"cell_type": "code",
"execution_count": 232,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAABXYUlEQVR4nO3dd3hUVfrA8e+ZXpNJryQhCb33JgIiKth7V6xrW3tdy2/Xil3Xupa117VXmqgIioj0hN7TSE9mkulzfn9MGAihQxhwzud5fGRu7tx5ZyDnnXvKe4SUEkVRFCX2aKIdgKIoihIdKgEoiqLEKJUAFEVRYpRKAIqiKDFKJQBFUZQYpYt2AHsjOTlZ5uXlRTsMRVGUw8qff/5ZLaVM2f74YZUA8vLymDdvXrTDUBRFOawIITbs6LjqAlIURYlRKgEoiqLEKJUAFEVRYtRhNQawI36/n5KSEjweT7RD2SmTyUR2djZ6vT7aoSiKokQc9gmgpKQEu91OXl4eQohoh9OGlJKamhpKSkro2LFjtMNRFEWJOOwTgMfjOWQbfwAhBElJSVRVVUU7FEVR9kKwuhr/4sUEK6vQ5eag79ULjc0W7bAOqMM+AQCHbOO/xaEen6LEsmB1NbK5GW1aGsJoDB9raKDh/gdwf/pZ5Lz4+/+F9ZKJCM1fZ+j0r/NOFEVR9lCwogLPr7/h/uEH6u/7PzaPGEndLbfhX7sOgMCKla0af4CGhx8hsH59FKJtPyoBHCCTJ0+mS5cuFBYWMmnSpGiHoyjKTviXLaPqlNOoOfMsai+aCD4f5hNPxP355zQ+9RTS6yXkbGz7RI8H2dR80ONtTyoBHADBYJBrr72W77//nuLiYj744AOKi4ujHZaiKNuRXi+N/36O4KZNkWOe7yej69wJhMA3+1cClZXoOnVCbNffr+/dC12H7IMdcruKuQTQ9NnnVAweSml2DhWDh9L02ef7fc25c+dSWFhIfn4+BoOBc845hy+//PIARKsoyoEUbGjAN/vXNsdDNTVYL78M82mnUnfZFbhefImkN/6LfvAg0GoxjhtHwtNPo3E4Dn7Q7egvMQi8p5o++5yG2+9Aut0ABEtLabj9DgCsp526z9ctLS2lQ4cOkcfZ2dn8/vvv+xesoigHnNbhwHjkkbg/b/3FT9e1C96fZ+L59jsA/EVFeL77nuQvPkN6vcjmZkScPRoht6uYSgDOSY9GGv8tpNuNc9Kj+5UAFEU5tAXKy/EvXESgrBTzSSeCTov7f5+AEFgmTkSbk4vn+8mtn2TQ41+0mPq770E2NKBJTibhhecxHTGizfX96zfg/e1XAsuWYxwxAsPgQWgTEg7Su9t3MZUAgmVle3V8T2VlZbFpmz7FkpISsrKy9uuaiqLsnpQS/5Il+BYsRBiNGPr1Q9+lc6tzgrW11N/5D7zTp0eOxf3jLqxTJiMrK/GvWkWorAzb1Vfhevk/EAwCYD37bOpuvQ1aqgyEqqupu+pqEt99G32XLmjMZgC88xfgmT4dvF60KSnU33kX1ksvwX7tNYf8lNGYSgDazEyCpaU7PL4/Bg0axKpVq1i3bh1ZWVl8+OGHvP/++/t1TUVRtvIVFeH+8isCq1djPvVUjEeMQJuQgG/ePKrPOgd8PgCEw0Hy/z7G0L1b5LmBFStaNf4AjU89TVLPntRceFHkmDa/I9ZLL6Hpzbcwn3E6usJOGPr3wzhiRPj6ej3Nn3yKd8aPeGf8iPmsM5G1ddSccy6yqSl8EaOR+DvvoOHRx7CceCK6vNz2/3D2w6Gdng4w+513IFqy9hbCbMZ+5x37dV2dTsfzzz/PscceS7du3TjrrLPo0aPHfl1TUZQw/6pVVJ91Dq4XXsQzZSp1V12N+/PPkYEArv+8Emn8AWR9Pd4ZM1o9P7Slcd6Wx4OvaGmrQ8G16zAMGojjoQeRdXX4VyzDet55oNOBECAg7q47QKfD+eRT+ObNw/3111sbfwCvF+/cue
gKCpDBQORwYNMmPLNm4VuyhNB23dDRFFN3AFv6+Z2THiVYVoY2MxP7nXcckP7/CRMmMGHChP2+jqIorfmLipH19a2OOZ96BuOxxxIsbdt9G6yoaPVYV1CAiItDNm6d228YOABtZutuWm2HDvhXrMT15FPhA5OnoElPw3Lmmbieez58rcJC7DfdAIB35i+R7qFtherrMZ98Irrs8JRR77w/qZ04kVBd+D3Yrrka07HHQCCIrlMh2qSkPf8wDrCYugOAcBJInzuHrJKNpM+dowZ/FeWQJ9sekRKNVov1koltfmY6Zlyrx/qOHUl45ikMgwYh4uMxTRiPccwYfL/8gq6wMHKe+YTjaXrp5VbPDVVsRmxTxTewejWhmlrQaMDrRd+vX5vXN598EpaTT0YYjYTq66m/6x+Rxh/A9eJLeGf/SvXpZ1Az8RL869bv4edw4MXUHYCiKIe+kNOJv7iYYHk52qwsdF26tPkGb7/xBrTp6ZjGHoXjySfwzfuTUHMT5vHjMQwcSKiujuDmSjTx8Wgz0nF/9z3CasVyyin45s/H89336Hv2xHL1VTS98CIaux3D4EG43nhz9/HV1qJJSEDfqxee7ydjv/023J9+hgwGsV93DaYJE9DGxwMQrK8nsKNFoV4vAP75C/BMnYL+b39rc0qwqgp/UTGh+jp0+QXou3VtlYwOBJUAFEU5ZIQ8HlyvvIrzqacjx+IevJ/kjz6g+dPPCKxaheWsszAeORIA6fMRamzEN28euoJ8tOnpBNavp+7GmwkUFaFJScHx5OMY+vah4Z778G7zWsZjxqHRaDCfeAL6Ll3w/DoHy1ln0vz2O5FzNAkOCIVaxajr1AnHE4/R8MBDBNeuJbBhPfZrrkEkJ2M6agxa+9b1ApqkJPT9+uFfsKD1G20pOgfgnTUb+3YJIFhVRd0dd+KdMrXlQhoSX3sV87HH7MOnunMqASiKcsgIrF6N8+lnWh1r/Of9pE6ZjONf/2x1XAaDNP33DVwvvhR+7qpVeH+eif322wgUFQEQqqrC+eRTxP3rX1guOJ/mDz+CYBDziScghKD+ppsj10t47RVCdXXYb7kZ788z0XXtgvnEE3BPmQp6PZrERGyXXYrr5f9gPvUUTKNHob3oAkDgeu110GgIud0YunZBY49Dl9MBrd1O3G23Un/rbeHp5jod1okX4y8qQpORTqi8AtPRY9t8Dv7i4q2NP0AoRP0/7sbQry/a1NQD8llDFBOAEKID8DaQRriT7xUp5bPRikdRlOgLNTSA3K7PPxAg1NjQ5txgeTmu1//b6ph0u5ENW8/VZKRjGjuWphdeAKMJ+7XXgBDounfH9fLL2G+/LTx/v7CQ5g8/wjv9B4TNhr5Pb4IlJYScLkxjxmAcMgSCQfxFRchAgODGjTS99TbxD96P64WXsF58EZrERHwLFtB4x50AWC++CMsZZ+B84UVM445Gm5mJvndvQnV1eH/8EdOYMRgGDEBTkI+/eBm6wgKEwRD+HLYZM4h8NhUVhJqa0O7nZ7ytaN4BBIBbpJTzhRB24E8hxDQppaqipigxStehA5oER6sGUJOWhq5DTptzhU6HxmIm5PW2/sE2/eSWk0/G9cqrICW2Ky7H+Uz4O6bj2acx9OuH87HH0ffqhX3IEAw9e2Lo0QP31KmRekGWc8+l/sabCNXUAOG1Avabb6L+7zegzc3F98ef2K68gsann8F68UU0v/te5LWbXv8vmrg4/AsX4ps9G4Qg/p/30fCvByLdSs2ffErc7bdR+8gkHJMewXLmGYQaG8PF6LYb9zAePRZtWtr+fcDbidosIClluZRyfsufncAy4LBcPnvppZeSmppKz549ox2KohzWdDk5JL7xBrpOncKPe/Qg8fVX0WaktzlXm56O/bbbWh/LykJXWBCetw8IoxFht2G94Hw0WVnY77wDXUEB2sQkmv77BgD2G67H9fJ/cD7zLM4XX8IwYACmY48FQHo8oN36nTu4dh2BxUsQ8f
HIxkZ0PXvgmTULXUF+235+wDt3Lvpu3dB170b8A/cTcnuw33wTlvPODZ/g8xFYtRptejqeH37A9c67VI0/nrqrriL
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"dbscan = DBSCAN(eps=0.75,min_samples=1)\n",
"display_categories(dbscan,two_blobs_outliers)"
]
},
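{
"cell_type": "markdown",
"metadata": {},
"source": [
"Why does `min_samples=1` leave no outliers? The docstring above says the neighborhood count includes the point itself, so with `min_samples=1` every point qualifies as a core point and none can be labeled noise. A quick check, assuming `two_blobs_outliers` is loaded as above (`labels` is just an illustrative name):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.cluster import DBSCAN\n",
"\n",
"labels = DBSCAN(min_samples=1).fit(two_blobs_outliers).labels_\n",
"\n",
"# DBSCAN uses the label -1 for noise points\n",
"print(np.sum(labels == -1))\n",
"\n",
"# Each isolated point becomes its own single-point cluster, so the cluster count grows\n",
"print(len(np.unique(labels)))"
]
},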
{
"cell_type": "markdown",
"metadata": {},
"source": [
"----"
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 1
}