Pandas Time Series Exercise Set #1 - Solution
For this set of exercises we'll use a dataset containing monthly milk production values in pounds per cow from January 1962 to December 1975.
IMPORTANT NOTE! Make sure you don't run the cells directly above the example output shown,
otherwise you will end up writing over the example output!
In [16]:
# RUN THIS CELL
import pandas as pd
%matplotlib inline
df = pd.read_csv('../Data/monthly_milk_production.csv', encoding='utf8')
title = "Monthly milk production: pounds per cow. Jan '62 - Dec '75"
print(len(df))
print(df.head())
So df has 168 records and 2 columns.
1. What is the current data type of the Date column?
HINT: We show how to list column dtypes in the first set of DataFrame lectures.
In [ ]:
# CODE HERE
In [17]:
# DON'T WRITE HERE
df.dtypes
Out[17]:
2. Change the Date column to a datetime format
In [ ]:
In [18]:
# DON'T WRITE HERE
df['Date']=pd.to_datetime(df['Date'])
df.dtypes
Out[18]:
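As a side note, pd.to_datetime also accepts an explicit format string, which can make the conversion stricter and easier to reason about. The sketch below is only an illustration: the three strings are made up (they are not read from the milk CSV), and the '%Y-%m-%d' layout is an assumption about how the dates are written.

```python
import pandas as pd

# Made-up sample strings, NOT taken from the milk dataset.
# The '%Y-%m-%d' layout is an assumption for illustration.
raw = pd.Series(['1962-01-01', '1962-02-01', 'not a date'])

# errors='coerce' turns unparseable entries into NaT instead of raising
parsed = pd.to_datetime(raw, format='%Y-%m-%d', errors='coerce')

print(parsed.dtype)         # a datetime64 dtype
print(parsed.isna().sum())  # number of entries that failed to parse
```

Counting the NaT values afterwards is a quick way to check whether any rows failed to parse.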
3. Set the Date column to be the new index
In [ ]:
In [19]:
# DON'T WRITE HERE
df.set_index('Date',inplace=True)
df.head()
Out[19]:
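Steps 2 and 3 can also be done in one go while reading the file, using the parse_dates and index_col arguments of pd.read_csv. The sketch below uses a small in-memory CSV rather than the real file; the 'Production' column name and the values are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical three-row CSV; the 'Production' name and the values are
# invented for this sketch and are not the real milk data.
csv_text = "Date,Production\n1962-01-01,100\n1962-02-01,110\n1962-03-01,120\n"

# index_col picks the index column, parse_dates=True parses that index as dates
sample = pd.read_csv(io.StringIO(csv_text), index_col='Date', parse_dates=True)

print(type(sample.index).__name__)
print(sample.head())
```

This should give the same kind of DatetimeIndex as calling pd.to_datetime followed by set_index, just with fewer steps.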
4. Plot the DataFrame with a simple line plot. What do you notice about the plot?
In [ ]:
In [20]:
# DON'T WRITE HERE
df.plot();
# THE PLOT SHOWS CONSISTENT SEASONALITY, AS WELL AS AN UPWARD TREND
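The trend and seasonality seen in the plot can also be checked numerically by averaging per year and per calendar month. The sketch below runs on a synthetic series (a made-up linear trend plus a 12-month sine cycle), not on the milk data, and the 'Production' column name is an assumption.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series (made-up values, NOT the milk data):
# a linear upward trend plus a repeating 12-month cycle.
idx = pd.date_range('1962-01-01', periods=48, freq='MS')
t = np.arange(48)
month = idx.month.to_numpy()
values = 600 + 2 * t + 50 * np.sin(2 * np.pi * (month - 1) / 12)
demo = pd.DataFrame({'Production': values}, index=idx)

# Averaging over each year smooths out the cycle and exposes the trend
yearly_mean = demo.groupby(demo.index.year)['Production'].mean()

# Averaging over each calendar month exposes the seasonal shape
monthly_mean = demo.groupby(demo.index.month)['Production'].mean()

print(yearly_mean)
print(monthly_mean)
```

If the yearly means rise steadily while the monthly means differ from one another, that is consistent with an upward trend plus seasonality.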
5. Add a column called 'Month' that takes the month value from the index
HINT: Use df.index here. Date is now the index rather than a column, so df['Date'] raises a KeyError.
BONUS: See if you can obtain the name of the month instead of a number!
In [ ]:
In [28]:
# DON'T WRITE HERE
df['Month']=df.index.month
df.head()
Out[28]:
In [22]:
# BONUS SOLUTION:
df['Month']=df.index.strftime('%B')
df.head()
Out[22]:
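Another way to get month names is the month_name() method on a DatetimeIndex, which returns English names by default. A minimal sketch on made-up data (the 'Production' column and its values are hypothetical):

```python
import pandas as pd

# Made-up three-month frame, not the milk data
idx = pd.date_range('1962-01-01', periods=3, freq='MS')
demo = pd.DataFrame({'Production': [100, 110, 120]}, index=idx)

# month_name() returns the full month name for each timestamp in the index
demo['Month'] = demo.index.month_name()

print(demo)
```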
6. Create a BoxPlot that groups by the Month field
In [ ]:
In [29]:
# DON'T WRITE HERE
df.boxplot(by='Month',figsize=(12,5));
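One thing to watch for: if 'Month' holds names rather than numbers, grouping sorts them alphabetically (April, August, ...). A possible workaround is to store the names as an ordered categorical so that groups follow calendar order. The sketch below uses made-up data and a hypothetical 'Production' column, and it checks the group order with groupby rather than drawing the plot.

```python
import pandas as pd

# Made-up two-year monthly frame, not the milk data
idx = pd.date_range('1962-01-01', periods=24, freq='MS')
demo = pd.DataFrame({'Production': range(24)}, index=idx)

# Calendar-ordered month names, January through December
month_order = list(pd.date_range('2000-01-01', periods=12, freq='MS').month_name())

# An ordered categorical makes grouping follow calendar order
demo['Month'] = pd.Categorical(demo.index.month_name(),
                               categories=month_order, ordered=True)

medians = demo.groupby('Month', observed=True)['Production'].median()
print(medians)
```

With the column stored this way, demo.boxplot(by='Month') should lay the boxes out from January to December, since it groups on the same column.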