
<html> <head> </head>

___

Copyright by Pierian Data Inc. For more information, visit us at www.pieriandata.com

Matrix Plots

NOTE: Make sure to watch the video lecture; not all datasets are well suited to a heatmap or clustermap.

Imports

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

The Data

World Population Prospects publishes United Nations population estimates for every country in the world and for every year from 1950 to 2020, as well as projections under different scenarios (low, middle and high variants) from 2020 to 2100. The figures presented here correspond to the middle-variant projections for the given year.

https://www.ined.fr/en/everything_about_population/data/all-countries/?lst_continent=900&lst_pays=926

Source: Estimates for the current year based on data from the World Population Prospects. United Nations.

In [2]:
# 2020 Projections
df = pd.read_csv('country_table.csv')
In [3]:
df
Out[3]:
   Countries                        Birth rate  Mortality rate  Life expectancy  Infant mortality rate  Growth rate
0  AFRICA                               32.577           7.837           63.472                 44.215        24.40
1  ASIA                                 15.796           7.030           73.787                 23.185         8.44
2  EUROPE                               10.118          11.163           78.740                  3.750         0.38
3  LATIN AMERICA AND THE CARIBBEAN      15.886           6.444           75.649                 14.570         8.89
4  NORTHERN AMERICA                     11.780           8.833           79.269                  5.563         6.11
5  OCEANIA                              16.235           6.788           78.880                 16.939        12.79
6  WORLD                                17.963           7.601           72.766                 27.492        10.36

Heatmap

In [4]:
df = df.set_index('Countries')
In [5]:
df
Out[5]:
                                 Birth rate  Mortality rate  Life expectancy  Infant mortality rate  Growth rate
Countries
AFRICA                               32.577           7.837           63.472                 44.215        24.40
ASIA                                 15.796           7.030           73.787                 23.185         8.44
EUROPE                               10.118          11.163           78.740                  3.750         0.38
LATIN AMERICA AND THE CARIBBEAN      15.886           6.444           75.649                 14.570         8.89
NORTHERN AMERICA                     11.780           8.833           79.269                  5.563         6.11
OCEANIA                              16.235           6.788           78.880                 16.939        12.79
WORLD                                17.963           7.601           72.766                 27.492        10.36
In [6]:
# Life expectancy is measured in years, unlike the rate columns, so it dominates the shared color scale
sns.heatmap(df)
Out[6]:
<AxesSubplot:ylabel='Countries'>
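
To see why the life expectancy column stands out in the heatmap above, it can help to compare the value ranges of the columns before plotting. The sketch below rebuilds the table inline from the values displayed in Out[5] (so that it runs without country_table.csv); the names column_ranges, lowest_life_expectancy and highest_rate are illustrative and not part of the original notebook.

```python
import pandas as pd

# Values copied from the table displayed above (originally read from country_table.csv)
df = pd.DataFrame(
    {
        "Birth rate": [32.577, 15.796, 10.118, 15.886, 11.780, 16.235, 17.963],
        "Mortality rate": [7.837, 7.030, 11.163, 6.444, 8.833, 6.788, 7.601],
        "Life expectancy": [63.472, 73.787, 78.740, 75.649, 79.269, 78.880, 72.766],
        "Infant mortality rate": [44.215, 23.185, 3.750, 14.570, 5.563, 16.939, 27.492],
        "Growth rate": [24.40, 8.44, 0.38, 8.89, 6.11, 12.79, 10.36],
    },
    index=pd.Index(
        ["AFRICA", "ASIA", "EUROPE", "LATIN AMERICA AND THE CARIBBEAN",
         "NORTHERN AMERICA", "OCEANIA", "WORLD"],
        name="Countries",
    ),
)

# Per-column minimum and maximum: one shared color scale has to span all of these
column_ranges = df.agg(["min", "max"]).T
print(column_ranges)

# Compare the smallest life expectancy with the largest value in any other column
lowest_life_expectancy = df["Life expectancy"].min()
highest_rate = df.drop("Life expectancy", axis=1).max().max()
print(lowest_life_expectancy, highest_rate)
```

If the smallest life expectancy is larger than every value in the other columns, that column will occupy the top of the color scale on its own, which is a reasonable cue to drop or rescale it before calling sns.heatmap.
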
In [7]:
rates = df.drop('Life expectancy',axis=1)
In [8]:
sns.heatmap(rates)
Out[8]:
<AxesSubplot:ylabel='Countries'>
In [9]:
# linewidths is the documented heatmap parameter for the lines between cells
sns.heatmap(rates, linewidths=0.5)
Out[9]:
<AxesSubplot:ylabel='Countries'>
In [10]:
# annot=True writes each cell's value on top of its color
sns.heatmap(rates, linewidths=0.5, annot=True)
Out[10]:
<AxesSubplot:ylabel='Countries'>
In [11]:
# Note that the argument here is cmap, not palette
sns.heatmap(rates, linewidths=0.5, annot=True, cmap='viridis')
Out[11]:
<AxesSubplot:ylabel='Countries'>
In [12]:
# Center the colormap on a chosen data value (here 40)
sns.heatmap(rates, linewidths=0.5, annot=True, cmap='viridis', center=40)
Out[12]:
<AxesSubplot:ylabel='Countries'>
In [13]:
# Center the colormap on a different value (here 1) to compare with the previous plot
sns.heatmap(rates, linewidths=0.5, annot=True, cmap='viridis', center=1)
Out[13]:
<AxesSubplot:ylabel='Countries'>
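
The center argument used above shifts which part of the colormap the data falls on. If the goal is instead to pin the ends of the colorbar to specific values, sns.heatmap also accepts vmin and vmax. The following is a minimal sketch, not part of the original notebook: the limits 0 and 50 are arbitrary illustrative choices, the table is rebuilt inline from the values shown earlier, and the Agg backend line is only there so the sketch runs as a plain script (omit it inside a notebook).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Values copied from the rates table in this notebook (life expectancy already dropped)
rates = pd.DataFrame(
    {
        "Birth rate": [32.577, 15.796, 10.118, 15.886, 11.780, 16.235, 17.963],
        "Mortality rate": [7.837, 7.030, 11.163, 6.444, 8.833, 6.788, 7.601],
        "Infant mortality rate": [44.215, 23.185, 3.750, 14.570, 5.563, 16.939, 27.492],
        "Growth rate": [24.40, 8.44, 0.38, 8.89, 6.11, 12.79, 10.36],
    },
    index=["AFRICA", "ASIA", "EUROPE", "LATIN AMERICA AND THE CARIBBEAN",
           "NORTHERN AMERICA", "OCEANIA", "WORLD"],
)

fig, ax = plt.subplots()
# vmin and vmax fix the data values mapped to the two ends of the colormap
# (0 and 50 are assumed limits chosen only for illustration)
sns.heatmap(rates, linewidths=0.5, annot=True, cmap="viridis", vmin=0, vmax=50, ax=ax)

# The heatmap mesh is the first collection on the axes; its color limits
# should match the vmin and vmax passed in
color_limits = ax.collections[0].get_clim()
print(color_limits)
```

Fixed limits like these can be useful when several heatmaps need to share the same color scale so that they are directly comparable.
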

Clustermap

Plot a matrix dataset as a hierarchically-clustered heatmap.

In [14]:
sns.clustermap(rates)
Out[14]:
<seaborn.matrix.ClusterGrid at 0x158e27976c8>
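
The row reordering in the clustermap above comes from hierarchical clustering. The sketch below reproduces that step directly with SciPy, assuming the settings documented as clustermap's defaults (method='average', metric='euclidean'); it is an illustration of the idea rather than seaborn's exact code path, and the names row_linkage and row_order are chosen here for the example. The table is rebuilt inline from the values shown earlier.

```python
import pandas as pd
from scipy.cluster.hierarchy import leaves_list, linkage

# Values copied from the rates table in this notebook (life expectancy already dropped)
rates = pd.DataFrame(
    {
        "Birth rate": [32.577, 15.796, 10.118, 15.886, 11.780, 16.235, 17.963],
        "Mortality rate": [7.837, 7.030, 11.163, 6.444, 8.833, 6.788, 7.601],
        "Infant mortality rate": [44.215, 23.185, 3.750, 14.570, 5.563, 16.939, 27.492],
        "Growth rate": [24.40, 8.44, 0.38, 8.89, 6.11, 12.79, 10.36],
    },
    index=["AFRICA", "ASIA", "EUROPE", "LATIN AMERICA AND THE CARIBBEAN",
           "NORTHERN AMERICA", "OCEANIA", "WORLD"],
)

# Agglomerative clustering of the rows: each row is a point in 4-dimensional space
# (assumed settings: average linkage on Euclidean distances)
row_linkage = linkage(rates.to_numpy(), method="average", metric="euclidean")

# Leaf order of the resulting dendrogram, i.e. the order in which rows would be drawn
row_order = leaves_list(row_linkage)
print(rates.index[row_order].tolist())

# The same table with its rows rearranged so that similar regions sit next to each other
reordered = rates.iloc[row_order]
print(reordered)
```

A precomputed linkage such as this one can also be handed to sns.clustermap through its row_linkage argument, which is one way to control or reuse the clustering across plots.
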
In [15]:
sns.clustermap(rates,col_cluster=False)
Out[15]:
<seaborn.matrix.ClusterGrid at 0x158e235c9c8>
In [16]:
sns.clustermap(rates,col_cluster=False,figsize=(12,8),cbar_pos=(-0.1, .2, .03, .4))
Out[16]:
<seaborn.matrix.ClusterGrid at 0x158e2ffc848>
In [17]:
# Clear the index name so it is not drawn as an axis label on the plot
rates.index.set_names('', inplace=True)
In [18]:
rates
Out[18]:
                                 Birth rate  Mortality rate  Infant mortality rate  Growth rate
AFRICA                               32.577           7.837                 44.215        24.40
ASIA                                 15.796           7.030                 23.185         8.44
EUROPE                               10.118          11.163                  3.750         0.38
LATIN AMERICA AND THE CARIBBEAN      15.886           6.444                 14.570         8.89
NORTHERN AMERICA                     11.780           8.833                  5.563         6.11
OCEANIA                              16.235           6.788                 16.939        12.79
WORLD                                17.963           7.601                 27.492        10.36
In [19]:
# Recall that you can always edit the DataFrame before passing it to seaborn
sns.clustermap(rates,col_cluster=False,figsize=(12,8),cbar_pos=(-0.1, .2, .03, .4))
Out[19]:
<seaborn.matrix.ClusterGrid at 0x158e354b508>


</html>