The International
Dota 2’s The International has been the biggest annual esports tournament since its debut in 2011. For reference, last year’s prize pool was over $40 million, and it’s crowdfunded too! For fans of Dota 2 like myself, this time of the year is comparable to Christmas.
Motivation
This year, the event is being held in Singapore, and its structure has changed so that the schedule is spread across multiple weeks. With the time zone difference and the confusing new schedule, I found myself going to liquipedia.net many times a day to check which games were being played and when. As the series are played during the night in my time zone, checking the schedule the next day meant seeing spoilers for who had won. Dota 2 has its own schedule page with an option to hide spoilers, but it involves too many clicks for my liking.
After all, laziness and impatience are important virtues for a programmer. Naturally, I wanted to write a script that will create an .ics (iCalendar) file for all matches converted to my time zone, which I can import to my personal calendar so that I can view the schedule spoiler-free by swiping to my calendar widget on my phone’s home screen.
Web Scraping
Fortunately, the HTML content of the website I use to check the schedule is structured and predictable. Each match has an info icon which, when clicked, opens a small pop-up containing the information we need: the names of the teams and the date/time of the match.
There are only a few cases to consider:
- Both teams are known – match time is known
- One or both teams unknown or winner/loser of another match that hasn’t been played yet – match time is known
- Teams known/unknown – match time unknown
We will ignore the cases where the match time is unknown because it won’t be useful to display them on the calendar.
The match pop-up card looks like this:
By inspecting the page for the cases above, we identify some class names for HTML elements that will be used to scrape the match information:
- The match details will always be in the <div> element with the class name brkts-match-info-popup.
- Teams (known or TBD) will appear in the <div> element with the class name brkts-popup-header-opponent-{direction}, where direction is either left or right.
- If the team is known, the name of the team will be within the <span> element with the class name name.
- If the team is the winner/loser of another match or simply TBD, that information will be in the <div> element with the class name brkts-opponent-block-literal.
- If a team is the winner/loser of another game, the information will appear as a string; if it’s TBD, there will be a zero-width space character instead.
- The date and time of the match come from the <div> element with the class name timer-object and follow a fixed format (see image above). Note: the time zone in the image is BST because the website is aware of my location; however, when using requests.get(), the time zone defaults to the local time zone of the event, which is SGT (UTC+8).
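Two of the details above are easy to try in isolation before touching the real page: the zero-width space that stands in for an undecided team, and the SGT timestamp. The sketch below is only an illustration using the standard library. The helper names normalise_team and parse_timer_text are made up for this example, and the sample strings are hand-written to mimic the pop-up text, not copied from the site.

```python
from datetime import datetime, timezone

ZERO_WIDTH_SPACE = "\u200b"


def normalise_team(label: str) -> str:
    """Return "TBD" for the empty placeholder, otherwise the label unchanged."""
    return "TBD" if label == ZERO_WIDTH_SPACE else label


def parse_timer_text(text: str) -> datetime:
    """Parse text such as 'October 8, 2022 - 10:05 SGT' into an aware UTC datetime."""
    # strptime cannot resolve the abbreviation "SGT" on its own,
    # so swap in a name and offset it does understand.
    text = text.strip().replace("SGT", "UTC+0800")
    local = datetime.strptime(text, "%B %d, %Y - %H:%M %Z%z")
    return local.astimezone(timezone.utc)


# Hypothetical sample values, assumed to match the pop-up format.
print(normalise_team("\u200b"))
print(normalise_team("Team Secret"))
print(parse_timer_text("October 8, 2022 - 10:05 SGT").isoformat())
```

Converting to UTC here keeps the example free of any time zone database; the scraper itself converts to Europe/London with pytz instead.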
There are 3 pages we’re interested in: Last Chance Qualifier, Group Stage and Main Event. Their names, with underscores in place of spaces, form the final path segment of the request URL.
Equipped with this information we can set up the web scraping function as follows:
import requests
from bs4 import BeautifulSoup
from typing import List
from datetime import datetime
import pytz


def get_matches() -> List[dict]:
    """
    Get match details for The International 2022.

    Returns
    -------
    List[dict]
        Each dictionary describes one match/series
    """
    all_matches = []
    for pathname in ["Last_Chance_Qualifier", "Group_Stage", "Main_Event"]:
        page = requests.get(
            f"https://liquipedia.net/dota2/The_International/2022/{pathname}"
        )
        soup = BeautifulSoup(page.content, "html.parser")
        match_details = soup.find_all("div", class_="brkts-popup brkts-match-info-popup")
        matches = []
        for m in match_details:
            teams = []
            for direction in ["left", "right"]:
                team_info = m.select_one(f".brkts-popup-header-opponent-{direction}")
                team = team_info.select_one(".name")
                if not team:  # Team names unknown/not decided
                    team = team_info.select_one(".brkts-opponent-block-literal")
                team = team.get_text()
                if team == "\u200b":  # Zero width space character
                    team = "TBD"
                teams.append(team)
            team_left, team_right = teams
            dt = m.select_one(".timer-object")
            if dt:
                dt = dt.get_text()
                dt = dt.replace(
                    "SGT", "UTC+0800"
                )  # Converting time zone part to expected format
                dt = datetime.strptime(dt, "%B %d, %Y - %H:%M %Z%z").astimezone(
                    pytz.timezone("Europe/London")
                )  # Converting times to BST
            match = {
                "team_left": team_left,
                "team_right": team_right,
                "date": dt if dt else None,
                "description": (" ").join(pathname.split("_")),
            }
            if match["date"]:  # Don't need to save dateless events
                matches.append(match)
        if matches:
            all_matches += matches
    return all_matches
A match from the resulting list will look like this (the date is a time zone-aware datetime, shown here in its string form):
{
    "team_left": "Team Secret",
    "team_right": "Tempest",
    "date": "2022-10-08 03:05:00+01:00",
    "description": "Last Chance Qualifier"
}
Calendar
Now that we have the match information, we can move on to the iCalendar file (using the ics Python package).
The ics package has 2 classes we’re interested in: Calendar and Event.
We initialise the calendar with calendar = Calendar(). Then we can create events and add them to the calendar. An ics Event() has various properties we can populate, such as name, begin, duration and description.
- The datetime of the match will be the begin property for each match.
- As series don’t have a set end time, we’ll just add 2.5 hours as the duration.
- The name of the event will be of the form Team A vs. Team B.
- The description will be the portion of the event the match is in.
We import our get_matches function along with the other required packages and convert the matches to iCalendar events:
from ics import Calendar, Event
from datetime import timedelta
from acquisition import get_matches


def export_calendar():
    """Create an .ics calendar of current matches for TI 2022."""
    calendar = Calendar()
    matches = get_matches()
    for match in matches:
        e = Event()
        e.name = f"{match['team_left']} vs. {match['team_right']}"
        e.begin = match['date']
        e.description = match['description']
        e.duration = timedelta(hours=2, minutes=30)
        calendar.events.add(e)
    with open('ti-calendar.ics', 'w') as f:
        f.writelines(calendar.serialize_iter())
An event in the ti-calendar.ics file will look like this:
BEGIN:VEVENT
DESCRIPTION:Last Chance Qualifier
DURATION:PT2H30M
DTSTART:20221008T065000Z
SUMMARY:Wildcard Gaming vs. Virtus.pro
UID:77371b63-79ab-4d8f-a927-39164b1cfa45@7737.org
END:VEVENT
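The DTSTART and DURATION values can look cryptic at first glance. The ics package writes them for us, so the following is not needed in the script; it is only a rough standard-library sketch of how those two strings relate to the Python values we set on the event. The helper names format_dtstart and format_duration are hypothetical and are not part of ics, and the fixed +01:00 offset is an assumption standing in for BST on that date.

```python
from datetime import datetime, timedelta, timezone


def format_dtstart(moment: datetime) -> str:
    """Format an aware datetime as an iCalendar UTC date-time string."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def format_duration(length: timedelta) -> str:
    """Format a positive duration of whole minutes, under a day, as e.g. PT2H30M."""
    total_minutes = int(length.total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    text = "PT"
    if hours:
        text += f"{hours}H"
    if minutes:
        text += f"{minutes}M"
    # A zero-length duration still needs at least one component.
    return text if text != "PT" else "PT0M"


# Assumption: a fixed +01:00 offset stands in for BST on this date.
bst = timezone(timedelta(hours=1))
start = datetime(2022, 10, 8, 7, 50, tzinfo=bst)

print(format_dtstart(start))                            # 20221008T065000Z
print(format_duration(timedelta(hours=2, minutes=30)))  # PT2H30M
```

The trailing Z marks the time as UTC, which is why 07:50 BST shows up as 06:50 in the file; the calendar app converts it back to local time when displaying the event.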
Now you can head to your favourite calendar and navigate to the Import/Export area. Here you can just upload the ti-calendar.ics file and you will see something like this:
Next Steps
If you have more time, you can:
- Use Google Calendar API to automate the calendar import
- Set up a batch file and schedule a task to run the script once a day