I needed an offline copy of an economic calendar with all of the major international economic events. After grubbing around the internet I found the Economic Calendar on Myfxbook which had everything that I needed.
Here’s a screenshot of that calendar.
This seemed like a good candidate for some simple web scraping. However, the big orange button at the bottom was an indication of some minor challenges afoot. If you wanted to get the full calendar then you’d need to press this button repeatedly to retrieve additional pages of data. And, for the purpose of web scraping, this would need to be automated.
In addition there are a couple of modal dialogs that pop up on the page when you first visit the site. I’d like to get those out of the way too.
Choice of Tools
I had to choose between (1) reverse engineering the API behind the site and (2) using a browser automation tool to interact with the site.
After taking a look at the network requests going back and forth between my browser and the server I concluded that the second approach would be best. My preferred tools for this are either Selenium or Playwright. Recently I have been leaning towards the latter.
Implementation
The scraper script (implemented in Python) ultimately consisted of a few components:
- spin up Playwright and launch a Chromium instance;
- navigate to the calendar URL;
- deal with the various popups on initial page launch (probably not strictly necessary but good to emulate real user interaction);
- keep on smashing the button (with liberal pauses) until all of the data has been retrieved; and
- parse the resulting table, then dump to CSV.
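The browser steps in the list above could be sketched roughly as follows. This is an illustration rather than the actual script: the URL and every selector are assumptions (placeholders) that would need to be replaced with values found by inspecting the page, and the helper names are made up for this example.

```python
import random

# All three values below are assumptions, not taken from the real scraper.
CALENDAR_URL = "https://www.myfxbook.com/forex-economic-calendar"
POPUP_CLOSE_SELECTORS = ["#popup-close", ".modal .close"]  # hypothetical
MORE_BUTTON_SELECTOR = "#show-more"  # hypothetical


def dismiss_popups(page, selectors=POPUP_CLOSE_SELECTORS):
    """Click the close control of each popup that is currently visible."""
    for selector in selectors:
        control = page.locator(selector)
        if control.count() > 0 and control.first.is_visible():
            control.first.click()


def load_all_rows(page, selector=MORE_BUTTON_SELECTOR, max_clicks=200):
    """Click the 'more' button until it is gone; return the number of clicks."""
    clicks = 0
    while clicks < max_clicks:
        button = page.locator(selector)
        if button.count() == 0 or not button.first.is_visible():
            break
        button.first.click()
        clicks += 1
        # Liberal, slightly randomised pause so that the new rows can load.
        page.wait_for_timeout(random.randint(1500, 3000))
    return clicks


def fetch_calendar_html(url=CALENDAR_URL):
    """Drive Chromium through the page and return the fully expanded HTML."""
    # Imported lazily so the helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        dismiss_popups(page)
        load_all_rows(page)
        html = page.content()
        browser.close()
    return html
```

The max_clicks cap is a safety net in case the button never disappears. The click loop takes the page as an argument so that it can be exercised against a stub without launching a browser.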
After developing and testing I deployed this as a job that’s run daily via GitLab CI/CD.
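The final step in the list, parsing the table and dumping it to CSV, might look something like this using only the standard library. Again this is a sketch, not the actual scraper: the class and function names are hypothetical, the header order is an assumption, and the real page may well need more cleaning (for example of the date cells) than is shown here.

```python
import csv
from html.parser import HTMLParser

# Column names as they appear in the CSV; the order is an assumption.
HEADER = ["date", "currency", "event", "impact", "previous", "consensus", "actual"]


class TableParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows without data cells (e.g. header rows)
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_rows(html):
    """Return the data rows found in the HTML as lists of cell text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def write_csv(rows, path, header=HEADER):
    """Write the header plus every row that has the expected number of cells."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(row for row in rows if len(row) == len(header))
```

Dropping rows with an unexpected number of cells is a crude way of skipping separator or advertising rows. A stricter version would target the calendar table by its selector.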
Results
The resulting CSV file contains all of the columns shown in the screenshot above. Here, for example, I load the file into R and display the first 20 records.
library(dplyr)

# Load the CSV, parse the timestamps and drop the value columns for brevity.
calendar <- read.csv("calendar.csv") |>
  rename(iso = currency) |>
  mutate(date = strptime(date, "%Y-%m-%d %H:%M:%S")) |>
  select(-previous, -consensus, -actual)

head(calendar, n = 20)
   date                iso event                                 impact
 1 2024-10-02 00:00:00 CNY National Day Golden Week              None
 2 2024-10-02 00:01:00 AUD CoreLogic Dwelling Prices MoM (Sep)   None
 3 2024-10-02 05:00:00 JPY Consumer Confidence (Sep)             High
 4 2024-10-02 07:00:00 EUR Unemployment Change (Sep)             High
 5 2024-10-02 07:00:00 EUR Tourist Arrivals YoY (Aug)            Low
 6 2024-10-02 07:15:00 EUR ECB Guindos Speech                    High
 7 2024-10-02 08:00:00 EUR Unemployment Rate (Aug)               High
 8 2024-10-02 09:00:00 EUR Retail Sales YoY (Aug)                Low
 9 2024-10-02 09:00:00 EUR Unemployment Rate (Aug)               Low
10 2024-10-02 09:00:00 EUR Unemployment Rate (Aug)               High
11 2024-10-02 09:00:00 GBP 5-Year Treasury Gilt Auction          Low
12 2024-10-02 09:30:00 EUR 10-Year Bund Auction                  Medium
13 2024-10-02 09:30:00 EUR ECB Lane Speech                       Low
14 2024-10-02 09:45:00 EUR ECB Buch Speech                       Low
15 2024-10-02 10:00:00 EUR Unemployment Rate (Sep)               Low
16 2024-10-02 10:10:00 EUR 3-Month Bill Auction                  Low
17 2024-10-02 10:10:00 EUR 6-Month Bill Auction                  Low
18 2024-10-02 10:30:00 EUR Budget Balance (Aug)                  Low
19 2024-10-02 11:00:00 USD MBA Mortgage Refinance Index (Sep/27) Low
20 2024-10-02 11:00:00 USD MBA Purchase Index (Sep/27)           Low
In the interests of brevity I have omitted the previous, consensus, and actual columns. However, these are included in the CSV data. You can slice and dice these data as required. For example, here are the high impact events during the first two trading days of November 2024.
calendar |>
  filter(
    impact == "High",
    date >= "2024-11-01",
    date < "2024-11-05"
  )
   date                iso event                              impact
 1 2024-11-01 01:45:00 CNY Caixin Manufacturing PMI (Oct)     High
 2 2024-11-01 08:00:00 EUR Unemployment Rate (Oct)            High
 3 2024-11-01 08:30:00 CHF procure.ch Manufacturing PMI (Oct) High
 4 2024-11-01 09:00:00 EUR S&P Global Manufacturing PMI (Oct) High
 5 2024-11-01 09:30:00 GBP S&P Global Manufacturing PMI (Oct) High
 6 2024-11-01 12:30:00 USD Nonfarm Payrolls Private (Oct)     High
 7 2024-11-01 12:30:00 USD U-6 Unemployment Rate              High
 8 2024-11-01 12:30:00 USD Non Farm Payrolls (Oct)            High
 9 2024-11-01 12:30:00 USD Unemployment Rate (Oct)            High
10 2024-11-01 13:30:00 CAD S&P Global Manufacturing PMI (Oct) High
11 2024-11-01 13:45:00 USD S&P Global Manufacturing PMI (Oct) High
12 2024-11-01 14:00:00 USD ISM Manufacturing PMI (Oct)        High
13 2024-11-04 08:15:00 EUR HCOB Manufacturing PMI (Oct)       High
14 2024-11-04 08:45:00 EUR HCOB Manufacturing PMI (Oct)       High
15 2024-11-04 08:50:00 EUR HCOB Manufacturing PMI (Oct)       High
16 2024-11-04 08:55:00 EUR HCOB Manufacturing PMI (Oct)       High
17 2024-11-04 09:00:00 EUR HCOB Manufacturing PMI (Oct)       High
18 2024-11-04 22:00:00 AUD Judo Bank Services PMI (Oct)       High
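If you would rather stay in Python, roughly the same filter can be written with the standard library. This sketch assumes that the CSV has date, currency, event and impact columns and that the dates use the %Y-%m-%d %H:%M:%S format, as in the R code above. The function names are hypothetical.

```python
import csv
from datetime import datetime


def load_calendar(path):
    """Read the CSV into a list of dicts with parsed datetime values."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    for row in rows:
        row["date"] = datetime.strptime(row["date"], "%Y-%m-%d %H:%M:%S")
    return rows


def high_impact(rows, start, end):
    """Return the high impact events with start <= date < end."""
    return [
        row for row in rows
        if row["impact"] == "High" and start <= row["date"] < end
    ]
```

Calling high_impact(load_calendar("calendar.csv"), datetime(2024, 11, 1), datetime(2024, 11, 5)) should select the same events as the R filter.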
The CSV file with these data can be downloaded here and will be updated daily.
Related
If you’re interested in an earnings calendar, take a look at this post.