plot.ly dash is a great framework for building dashboards – but that’s not the only use for it. I tend to use plot.ly dash for all my web-app needs – mostly because I haven’t hit a wall yet where I can’t achieve something using plot.ly dash that I can be using another framework.
I recently got to thinking whether I could build a calendar using a DataTable
in dash. Why not? If I can get some set of dataframes in the shape of a calendar then I could display it in dash and use all the nice features of dash_table.DataTable
to format it.
First we need a dataframe in the shape of a traditional calendar that I can index by year and month. I will then display this calendar on a plot.ly dash app in Part 2.
Part 1: Calendar with pandas
Initialise the dataframe with the dates and use pandas .dt
accessors to get relevant information:
df = pd.DataFrame({"date": pd.date_range('2021-12-01', '2040-12-31')})
df["week_day"] = df.date.dt.weekday
df['day'] = df.date.dt.day
df["week"] = df.date.dt.isocalendar().week
df['month'] = df.date.dt.month
df["year"] = df.date.dt.year
💡Apparently dt.week
is deprecated so use dt.isocalendar().week
date week_day day week month year
0 2021-12-01 2 1 48 12 2021
1 2021-12-02 3 2 48 12 2021
... ... ... ... ... ... ...
6969 2040-12-30 6 30 52 12 2040
6970 2040-12-31 0 31 1 12 2040
df = df.set_index(['year', 'month', 'week','week_day'])
df.head(10)
date day
year month week week_day
2021 12 48 2 2021-12-01 1
3 2021-12-02 2
4 2021-12-03 3
5 2021-12-04 4
6 2021-12-05 5
49 0 2021-12-06 6
1 2021-12-07 7
2 2021-12-08 8
3 2021-12-09 9
4 2021-12-10 10
Great! Now we can use the unstack method to pivot the week_day
level of the index so we get something that looks like a traditional calendar. I want to keep the initial dataframe as a meta table and display the unstacked version so I will keep both.
calendar = df.unstack('week_day')['day']
week_day 0 1 2 3 4 5 6
year month week
2021 12 48 NaN NaN 1.0 2.0 3.0 4.0 5.0
49 6.0 7.0 8.0 9.0 10.0 11.0 12.0
50 13.0 14.0 15.0 16.0 17.0 18.0 19.0
51 20.0 21.0 22.0 23.0 24.0 25.0 26.0
52 27.0 28.0 29.0 30.0 31.0 NaN NaN
Doing this I realised something was wrong with the last week of December 2021 and first week of January 2022. The first week of January seems to come out as week number 52 instead of 1.
week_day 0 1 2 3 4 5 6
week
1 3.0 4.0 5.0 6.0 7.0 8.0 9.0
2 10.0 11.0 12.0 13.0 14.0 15.0 16.0
3 17.0 18.0 19.0 20.0 21.0 22.0 23.0
4 24.0 25.0 26.0 27.0 28.0 29.0 30.0
5 31.0 NaN NaN NaN NaN NaN NaN
52 NaN NaN NaN NaN NaN 1.0 2.0
After some research I learned that as part of the ISO date and time standard that pandas uses, the first week of the year is the week with the first Thursday of the year. Which means the first week of January is actually the 52nd week of the year as it doesn’t have a Thursday in it. We can fix this by changing that week’s week number to 0 and pretend the week numbers start from 0 instead of 1.
df.loc[(df['month'] == 1) & (df['week'] == 52), 'week'] = 0
week_day 0 1 2 3 4 5 6
week
0 NaN NaN NaN NaN NaN 1.0 2.0
1 3.0 4.0 5.0 6.0 7.0 8.0 9.0
2 10.0 11.0 12.0 13.0 14.0 15.0 16.0
3 17.0 18.0 19.0 20.0 21.0 22.0 23.0
4 24.0 25.0 26.0 27.0 28.0 29.0 30.0
5 31.0 NaN NaN NaN NaN NaN NaN
Now it’s much better. I also want to display if a day is a public holiday in England in my final calendar. I will use the python holidays
library to achieve this.
uk_holidays = holidays.country_holidays('UK', 'England')
df['holiday'] = df["date"].apply(lambda x: x in uk_holidays)
uk_holidays
is a dictionary so we get a boolean column. You can also get the names of the holidays by using the .get
method.
df['holiday_name'] = df["date"].apply(uk_holidays.get)
date day holiday holiday_name
week week_day
22 2 2022-06-01 1 False None
3 2022-06-02 2 True Spring Bank Holiday
4 2022-06-03 3 True Platinum Jubilee of Elizabeth II
5 2022-06-04 4 False None
6 2022-06-05 5 False None
Finally I want my calendar to have the weekday names instead of weekday numbers so I will rename the columns
day_names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
calendar = calendar.rename(columns = {k:v for k,v in enumerate(day_names)})
week_day Monday Tuesday Wednesday Thursday Friday Saturday Sunday
year month week
2021 12 48 NaN NaN 1.0 2.0 3.0 4.0 5.0
49 6.0 7.0 8.0 9.0 10.0 11.0 12.0
50 13.0 14.0 15.0 16.0 17.0 18.0 19.0
51 20.0 21.0 22.0 23.0 24.0 25.0 26.0
52 27.0 28.0 29.0 30.0 31.0 NaN NaN
Looks more like a calendar now. We need to wrap everything in a function, format it nicely and we’ll use this in Part 2. See you next time 👋
import pandas as pd
import holidays
UK_HOLIDAYS = holidays.country_holidays("UK", "England")
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
def make_calendar(start_date="2021-12-01", end_date="2040-12-31"):
df = pd.DataFrame({"date": pd.date_range(start_date, end_date)})
df["week_day"] = df["date"].dt.weekday
df["day"] = df["date"].dt.day
df["week"] = df["date"].dt.isocalendar().week
df["month"] = df["date"].dt.month
df["year"] = df["date"].dt.year
df["holiday"] = df["date"].apply(lambda x: x in UK_HOLIDAYS)
df["holiday_name"] = df.date.apply(UK_HOLIDAYS.get)
df.loc[(df["month"] == 1) & (df["week"] == 52), "week"] = 0
df = df.set_index(["year", "month", "week", "week_day"])
calendar = df.unstack("week_day")["day"]
calendar = calendar.rename(columns={k: v for k, v in enumerate(DAY_NAMES)})
return calendar, df
By Zeynep Bicer