The International 2022 Schedule Calendar in Python

The International

Dota 2’s The International has been the biggest annual esports tournament since its debut in 2011. For reference, last year’s prize pool was over $40 million, and it’s crowdfunded too! For fans of Dota 2 like myself, this time of the year is comparable to Christmas.

Motivation

This year, the event is being held in Singapore, and its structure has changed so that the schedule is spread across multiple weeks. With the time zone difference and the confusing new schedule, I found myself going to liquipedia.net many times a day to check which games were being played and when. As the series are played at night in my time zone, checking the schedule the next day meant seeing spoilers for who had won. Dota 2 has its own schedule page with an option to hide spoilers, but it involves too many clicks for my liking.

After all, laziness and impatience are important virtues for a programmer. Naturally, I wanted to write a script that will create an .ics (iCalendar) file for all matches converted to my time zone, which I can import to my personal calendar so that I can view the schedule spoiler-free by swiping to my calendar widget on my phone’s home screen.

Web Scraping

Fortunately, the HTML content of the website I use to check the schedule is structured and predictable. Each match has an info icon that, when clicked, opens a small pop-up containing the information we need: the names of the teams and the date/time of the match.

There are only a few cases to consider:

  • Both teams are known – match time is known
  • One or both teams are unknown, or are the winner/loser of another match that hasn’t been played yet – match time is known
  • Teams known/unknown – match time unknown

We will ignore the cases where the match time is unknown because it won’t be useful to display them on the calendar.

The match pop-up card looks like this:

By inspecting the page for the cases above, we identify some class names for HTML elements that will be used to scrape the match information:

  • The match details will always be in the <div> element with the class name brkts-match-info-popup
  • Teams (known or TBD) will appear in the <div> element with the class name brkts-popup-header-opponent-{direction}, where direction is either left or right.
  • If the team is known, the name of the team will be within the <span> element with the class name name
  • If the team is the winner/loser of another match or simply TBD, that information will be in the <div> element with the class name brkts-opponent-block-literal
  • If a team is the winner/loser of another game, the information will appear as a string; if it’s TBD, there will be a zero-width space character instead
  • The date and time of the match come from the <div> element with the class name timer-object and follow a fixed format (see image above). Note: The time zone in the image is BST because the website is aware of my location; however, when using requests.get(), the time zone defaults to the local time zone of the event, which is SGT (UTC+8).
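The last few rules in the list above can be checked in isolation. The sketch below is illustrative rather than the scraper itself: resolve_team_label and parse_timer_text are hypothetical helper names, and the sample timer string is an assumption based on the format described above. It uses only the standard library and converts to UTC for the check, whereas the scraper in this post converts to Europe/London with pytz.

```python
from datetime import datetime, timezone
from typing import Optional

ZERO_WIDTH_SPACE = "\u200b"


def resolve_team_label(name_text: Optional[str], literal_text: Optional[str]) -> str:
    """Pick the label for one side of a match (hypothetical helper).

    name_text    -- text of the `name` span, or None if the team is not known
    literal_text -- text of the `brkts-opponent-block-literal` div, or None
    """
    if name_text:  # Team is known
        return name_text
    # Assumption: a missing literal is treated the same as a zero-width space.
    if not literal_text or literal_text == ZERO_WIDTH_SPACE:
        return "TBD"
    return literal_text  # e.g. a "winner/loser of another match" string


def parse_timer_text(text: str) -> datetime:
    """Parse timer text such as 'October 8, 2022 - 10:05 SGT' (assumed format)."""
    # %Z generally only recognises UTC, GMT and the local zone names,
    # so the SGT abbreviation is replaced with an explicit offset.
    normalised = text.replace("SGT", "UTC+0800")
    return datetime.strptime(normalised, "%B %d, %Y - %H:%M %Z%z")


print(resolve_team_label("Team Secret", None))        # Team Secret
print(resolve_team_label(None, ZERO_WIDTH_SPACE))     # TBD

start = parse_timer_text("October 8, 2022 - 10:05 SGT")
print(start.astimezone(timezone.utc))                 # 2022-10-08 02:05:00+00:00
```

Keeping the time zone-aware datetime (rather than a naive one) is what makes the later conversion to another time zone a single astimezone() call.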

There are 3 pages we’re interested in: Last Chance Qualifier, Group Stage and Main Event. Their names form the last path segment of the request URL.
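As a small sketch of how the three page names turn into request URLs and readable labels: the base URL is the one used by the scraping function in this post, while BASE_URL, page_url and stage_label are hypothetical names introduced only for this illustration.

```python
BASE_URL = "https://liquipedia.net/dota2/The_International/2022"
PATHNAMES = ["Last_Chance_Qualifier", "Group_Stage", "Main_Event"]


def page_url(pathname: str) -> str:
    """Build the URL of one stage page from its path name."""
    return f"{BASE_URL}/{pathname}"


def stage_label(pathname: str) -> str:
    """Turn a path name into a human-readable label for the calendar."""
    return pathname.replace("_", " ")


for pathname in PATHNAMES:
    print(stage_label(pathname), "->", page_url(pathname))
```

The same path name therefore serves two purposes: it selects the page to scrape and, with underscores replaced by spaces, describes which part of the event a match belongs to.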

Equipped with this information we can set up the web scraping function as follows:

import requests
from bs4 import BeautifulSoup
from typing import List
from datetime import datetime
import pytz


def get_matches() -> List[dict]:
    """
    Get match details for The International 2022.

    Returns
    -------
    List[dict]
        Each dictionary describes one match/series
    """
    all_matches = []
    for pathname in ["Last_Chance_Qualifier", "Group_Stage", "Main_Event"]:
        page = requests.get(
            f"https://liquipedia.net/dota2/The_International/2022/{pathname}"
        )
        soup = BeautifulSoup(page.content, "html.parser")
        match_details = soup.find_all("div", class_="brkts-popup brkts-match-info-popup")
        matches = []
        for m in match_details:
            teams = []
            for direction in ["left", "right"]:
                team_info = m.select_one(f".brkts-popup-header-opponent-{direction}")
                team = team_info.select_one(".name")
                if not team:  # Team names unknown/not decided
                    team = team_info.select_one(".brkts-opponent-block-literal")
                team = team.get_text()
                if team == "\u200b":  # Zero width space character
                    team = "TBD"
                teams.append(team)

            team_left, team_right = teams

            dt = m.select_one(".timer-object")
            if dt:
                dt = dt.get_text()
                dt = dt.replace(
                    "SGT", "UTC+0800"
                )  # Converting time zone part to expected format
                dt = datetime.strptime(dt, "%B %d, %Y - %H:%M %Z%z").astimezone(
                    pytz.timezone("Europe/London")
                )  # Converting times to BST
            match = {
                "team_left": team_left,
                "team_right": team_right,
                "date": dt if dt else None,
                "description": (" ").join(pathname.split("_")),
            }
            if match["date"]:  # Don't need to save dateless events
                matches.append(match)
        if matches:
            all_matches += matches
    return all_matches

A match from the resulting list will look like this:

{
    "team_left": "Team Secret",
    "team_right": "Tempest",
    "date": "2022-10-08 03:05:00+01:00",
    "description": "Last Chance Qualifier"
}

Calendar

Now that we have the match information, we can move on to the iCalendar (using the ics Python package).

The ics package has 2 classes we’re interested in: Calendar and Event.

We initialise the calendar with calendar = Calendar(). Then we can create events and add them to the calendar. An ics Event() has various properties we can populate such as name, begin, duration and description.

  • The datetime of the match will be the begin property for each match.
  • As series don’t have a set end time, we’ll just add 2.5 hours as the duration.
  • The name of the event will be of the form Team A vs. Team B.
  • The description will be the portion of the event the match is in.
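The mapping in the list above can be sketched without the ics package at all. This is only an illustration of which value goes where: event_fields and SERIES_DURATION are hypothetical names, and the fixed +01:00 offset stands in for BST so that the example needs nothing beyond the standard library.

```python
from datetime import datetime, timedelta, timezone

SERIES_DURATION = timedelta(hours=2, minutes=30)  # Assumed length of a series


def event_fields(match: dict) -> dict:
    """Map one scraped match to the values used for its calendar event."""
    return {
        "name": f"{match['team_left']} vs. {match['team_right']}",
        "begin": match["date"],
        "duration": SERIES_DURATION,
        "description": match["description"],
    }


bst = timezone(timedelta(hours=1), "BST")  # Fixed offset, for illustration only
sample = {
    "team_left": "Team Secret",
    "team_right": "Tempest",
    "date": datetime(2022, 10, 8, 3, 5, tzinfo=bst),
    "description": "Last Chance Qualifier",
}

fields = event_fields(sample)
print(fields["name"])                        # Team Secret vs. Tempest
print(fields["begin"] + fields["duration"])  # 2022-10-08 05:35:00+01:00
```

The second print shows the end time implied by the fixed 2.5-hour duration; a real series may of course finish earlier or later.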

We import our get_matches function from the scraping module (acquisition.py here) along with the other required packages, and convert the matches to iCalendar events:

from ics import Calendar, Event
from datetime import timedelta
from acquisition import get_matches


def export_calendar():
    """Create an .ics calendar of current matches for TI 2022."""
    calendar = Calendar()

    matches = get_matches()
    for match in matches:
        e = Event()
        e.name = f"{match['team_left']} vs. {match['team_right']}"
        e.begin = match['date']
        e.description = match['description']
        e.duration = timedelta(hours=2, minutes=30)
        calendar.events.add(e)

    with open('ti-calendar.ics', 'w') as f:
        f.writelines(calendar.serialize_iter())

An event in the ti-calendar.ics file will look like this:

BEGIN:VEVENT
DESCRIPTION:Last Chance Qualifier
DURATION:PT2H30M
DTSTART:20221008T065000Z
SUMMARY:Wildcard Gaming vs. Virtus.pro
UID:77371b63-79ab-4d8f-a927-39164b1cfa45@7737.org
END:VEVENT
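Note that DTSTART is written in UTC (the trailing Z), even though the datetimes were converted to UK time before being handed to the calendar. The sketch below reproduces the two formatted values by hand to make that visible. It is not how the ics package serialises events internally; to_ical_utc and to_ical_duration are hypothetical helpers, and the 07:50 +01:00 start time is simply the DTSTART above shifted by one hour.

```python
from datetime import datetime, timedelta, timezone


def to_ical_utc(dt: datetime) -> str:
    """Format an aware datetime as an iCalendar UTC date-time (trailing 'Z')."""
    return dt.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def to_ical_duration(delta: timedelta) -> str:
    """Format a whole-minute duration shorter than a day, e.g. PT2H30M."""
    total_minutes = int(delta.total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"PT{hours}H{minutes}M"


bst = timezone(timedelta(hours=1))  # Fixed +01:00 offset, for illustration only
start = datetime(2022, 10, 8, 7, 50, tzinfo=bst)

print(to_ical_utc(start))                                   # 20221008T065000Z
print(to_ical_duration(timedelta(hours=2, minutes=30)))     # PT2H30M
```

Because the stored value is in UTC, the calendar app is free to display the match in whatever time zone the phone or computer is set to.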

Now you can head to your favourite calendar and navigate to the Import/Export area. Here you can just upload the ti-calendar.ics file and you will see something like this:

Next Steps

If you have more time, you can:

  • Use the Google Calendar API to automate the calendar import
  • Set up a batch file and schedule a task to run the script once a day

By Zeynep Bicer