Andrew B. Collier / @datawookie


Link to CV.


SSH Tunnel: Local Port Forwarding

A tunnel with large yellow earth-moving equipment.

SSH tunnels are a powerful and secure method for transmitting data over potentially unsecured networks. They allow users to establish an encrypted connection between their local machine and a remote server, providing a secure and private pathway for data. An SSH tunnel will allow a service running on a remote machine to appear as if it is running on a local machine. This is also known as port forwarding.

Read More →

Static Redirects on Vercel

Moored boats in an art deco style.

A redirect is a rule which sends users to a different URL than the one they requested. They are most commonly used to ensure that browsers still get to the correct page after it has been moved to a new URL.

If you have a relatively small number of redirects and don’t need to do anything too fancy then static (or “configuration”) redirects are a good option. Static redirects are configured on Vercel by adding entries to the vercel.json configuration file. There’s just one major snag: you can only create 1024 redirects using this mechanism.

Read More →

Batch Resolving Merge Conflicts

A surrealistic image of the confluence between two rivers.

Sometimes when you run git merge you will be confronted with a huge load of merge conflicts. However, if you are lucky there might be a clear rule which you can apply to each of those conflicts, either

  • accept current change (change on current branch or ours) or
  • accept incoming change (incoming change from other branch or theirs).

In this case you can save yourself a lot of time and effort by specifying a particular merge strategy option.

Read More →

Externalise CSS

Externalise CSS

By default Gatsby will embed CSS into the <head> of each HTML page. This is not ideal. In this post I take a look at how to move that CSS into an external file and how the contents of that file can be optimised to remove unused CSS.

Read More →

Dynamic Routing

Month of Gatsby
Dynamic Routing with Gatsby

Suppose that you want to make your site routing a little more flexible. For example, rather than just going straight to a 404 page if the path is not found, you might want to try and guess an appropriate (and valid!) path. This is where dynamic routing comes into play.

Read More →

Custom 404 Page

Month of Gatsby
Custom 404 Page

Setting up a custom 404 page can add something special to your site. It provides you with the opportunity to do something memorable in the unfortunate event that a user asks for an unknown page.

Read More →

Cookies & Headers from Selenium

Cookies & Headers from Selenium

One of my standard approaches to scraping content from a dynamic website is to diagnose the API behind the site and then use it to retrieve data directly. This means that I can make efficient HTTP requests using the requests package and I don’t need to worry about all of the complexity around scraping with Selenium. However, it’s often the case that the API requests require a collection of cookies and headers, and those need to be gathered using Selenium.

Read More →

Adding robots.txt to a Gatsby Site

Adding robots.txt to a Gatsby Site

There are a couple files which can have an impact on the SEO performance of a site: (1) a sitemap and (2) a robots.txt. In a previous post we set up a sitemap which includes only the canonical pages on the site. In this post we’ll add a robots.txt.

A Gatsby site will not have a robots.txt file by default. There’s a handy package which makes it simple though. We’ll take a look at how to add it to the site and a couple of ways to configure it too.

Read More →

Update Sitemap for Canonical Pages

Update Sitemap for canonical pages.

The principal purpose of a sitemap file is to inform search engines about the pages on a website that are available for crawling. It provides a list of URLs along with additional metadata about each URL to help search engines more intelligently crawl the site. If there are multiple page versions on a site then the sitemap should include only the canonical versions of those pages.

Read More →