Calendar with pandas and plot.ly dash

plot.ly dash is a great framework for building dashboards – but that’s not the only use for it. I tend to use plot.ly dash for all my web-app needs – mostly because I haven’t hit a wall yet where I can’t achieve something using plot.ly dash that I can be using another framework.

I recently got to thinking whether I could build a calendar using a DataTable in dash. Why not? If I can get some set of dataframes in the shape of a calendar then I could display it in dash and use all the nice features of dash_table.DataTable to format it.

First we need a dataframe in the shape of a traditional calendar that I can index by year and month. I will then display this calendar on a plot.ly dash app in Part 2.

Part 1: Calendar with pandas

Initialise the dataframe with the dates and use pandas .dt accessors to get relevant information:

df = pd.DataFrame({"date": pd.date_range('2021-12-01', '2040-12-31')})

df["week_day"] = df.date.dt.weekday
df['day'] = df.date.dt.day
df["week"] = df.date.dt.isocalendar().week
df['month'] = df.date.dt.month
df["year"] = df.date.dt.year

?Apparently dt.week is deprecated so use dt.isocalendar().week

date  week_day  day  week  month  year
0    2021-12-01         2    1    48     12  2021
1    2021-12-02         3    2    48     12  2021
...         ...       ...  ...   ...    ...   ...
6969 2040-12-30         6   30    52     12  2040
6970 2040-12-31         0   31     1     12  2040
df = df.set_index(['year', 'month', 'week','week_day'])
df.head(10)
date  day
year month week week_day
2021 12    48   2        2021-12-01    1
                3        2021-12-02    2
                4        2021-12-03    3
                5        2021-12-04    4
                6        2021-12-05    5
           49   0        2021-12-06    6
                1        2021-12-07    7
                2        2021-12-08    8
                3        2021-12-09    9
                4        2021-12-10   10

Great! Now we can use the unstack method to pivot the week_day level of the index so we get something that looks like a traditional calendar. I want to keep the initial dataframe as a meta table and display the unstacked version so I will keep both.

calendar = df.unstack('week_day')['day']
week_day            0     1     2     3     4     5     6
year month week
2021 12    48     NaN   NaN   1.0   2.0   3.0   4.0   5.0
           49     6.0   7.0   8.0   9.0  10.0  11.0  12.0
           50    13.0  14.0  15.0  16.0  17.0  18.0  19.0
           51    20.0  21.0  22.0  23.0  24.0  25.0  26.0
           52    27.0  28.0  29.0  30.0  31.0   NaN   NaN

Doing this I realised something was wrong with the last week of December 2021 and first week of January 2022. The first week of January seems to come out as week number 52 instead of 1.

week_day     0     1     2     3     4     5     6
week
1          3.0   4.0   5.0   6.0   7.0   8.0   9.0
2         10.0  11.0  12.0  13.0  14.0  15.0  16.0
3         17.0  18.0  19.0  20.0  21.0  22.0  23.0
4         24.0  25.0  26.0  27.0  28.0  29.0  30.0
5         31.0   NaN   NaN   NaN   NaN   NaN   NaN
52         NaN   NaN   NaN   NaN   NaN   1.0   2.0

After some research I learned that as part of the ISO date and time standard that pandas uses, the first week of the year is the week with the first Thursday of the year. Which means the first week of January is actually the 52nd week of the year as it doesn’t have a Thursday in it. We can fix this by changing that week’s week number to 0 and pretend the week numbers start from 0 instead of 1.

df.loc[(df['month'] == 1) & (df['week'] == 52), 'week'] = 0
week_day     0     1     2     3     4     5     6
week
0          NaN   NaN   NaN   NaN   NaN   1.0   2.0
1          3.0   4.0   5.0   6.0   7.0   8.0   9.0
2         10.0  11.0  12.0  13.0  14.0  15.0  16.0
3         17.0  18.0  19.0  20.0  21.0  22.0  23.0
4         24.0  25.0  26.0  27.0  28.0  29.0  30.0
5         31.0   NaN   NaN   NaN   NaN   NaN   NaN

Now it’s much better. I also want to display if a day is a public holiday in England in my final calendar. I will use the python holidays library to achieve this.

uk_holidays = holidays.country_holidays('UK', 'England')
df['holiday'] = df["date"].apply(lambda x: x in uk_holidays)

uk_holidays is a dictionary so we get a boolean column. You can also get the names of the holidays by using the .get method.

df['holiday_name'] = df["date"].apply(uk_holidays.get)
date  day  holiday                      holiday_name
week week_day
22   2        2022-06-01    1    False                              None
     3        2022-06-02    2     True               Spring Bank Holiday
     4        2022-06-03    3     True  Platinum Jubilee of Elizabeth II
     5        2022-06-04    4    False                              None
     6        2022-06-05    5    False                              None

Finally I want my calendar to have the weekday names instead of weekday numbers so I will rename the columns

day_names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
calendar = calendar.rename(columns = {k:v for k,v in enumerate(day_names)})
week_day         Monday  Tuesday  Wednesday  Thursday  Friday  Saturday  Sunday
year month week
2021 12    48       NaN      NaN        1.0       2.0     3.0       4.0     5.0
           49       6.0      7.0        8.0       9.0    10.0      11.0    12.0
           50      13.0     14.0       15.0      16.0    17.0      18.0    19.0
           51      20.0     21.0       22.0      23.0    24.0      25.0    26.0
           52      27.0     28.0       29.0      30.0    31.0       NaN     NaN

Looks more like a calendar now. We need to wrap everything in a function, format it nicely and we’ll use this in Part 2. See you next time ?

import pandas as pd
import holidays


UK_HOLIDAYS = holidays.country_holidays("UK", "England")
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def make_calendar(start_date="2021-12-01", end_date="2040-12-31"):

    df = pd.DataFrame({"date": pd.date_range(start_date, end_date)})
    df["week_day"] = df["date"].dt.weekday
    df["day"] = df["date"].dt.day
    df["week"] = df["date"].dt.isocalendar().week
    df["month"] = df["date"].dt.month
    df["year"] = df["date"].dt.year

    df["holiday"] = df["date"].apply(lambda x: x in UK_HOLIDAYS)
    df["holiday_name"] = df.date.apply(UK_HOLIDAYS.get)
		
    df.loc[(df["month"] == 1) & (df["week"] == 52), "week"] = 0

    df = df.set_index(["year", "month", "week", "week_day"])

    calendar = df.unstack("week_day")["day"]
    calendar = calendar.rename(columns={k: v for k, v in enumerate(DAY_NAMES)})

    return calendar, df

By Zeynep Bicer

Published
Categorized as blog

Leave a comment

Your email address will not be published. Required fields are marked *