Blog Posts by Andrew B. Collier / @datawookie


Weekly Digest & Annual Review

A large library with vaulted ceiling.

A quick review of the year.

  • I published 55 posts (including this one).
  • I spent a lot of time working with GatsbyJS for one of my clients. At first I was quite out of my depth, but I slowly figured out more or less how it works. I documented some of my learning in a series of posts.
  • My most popular post is still about Shared Memory & Docker. The runner up looks at how to Install GitLab Runner with Docker.
  • I spent some time compiling data on kayak specifications in the hope of producing a definitive table. It’s a work in progress but it’s already getting quite a lot of interest.

Now onto a few interesting articles from this week, mostly announcements of new versions.

Read More →

Chrome & ChromeDriver in Docker

A whale leaping out of the ocean in the style of Vincent van Gogh.

When I containerised Selenium crawlers in the past I normally used a remote driver connection from the crawler to Selenium, running a separate Docker image with Selenium and accessing it via port 4444. This has proven to be a robust design. However, it does mean two containers rather than just one, leading to a higher maintenance burden and elevated resource requirements.

What about simply embedding Chrome and ChromeDriver directly into the crawler image? It requires a bit more work, but it’s worth it. The critical point is ensuring compatible versions of Chrome and ChromeDriver.

Read More →

SSH Tunnel: Dynamic Port Forwarding

SSH Tunnel: Dynamic Port Forwarding

With a local or remote SSH tunnel the ports on both the local and remote machines must be specified at the time of creating the tunnel. But what if you need something more flexible? That’s where Dynamic Port Forwarding comes into play.

Read More →

SSH Tunnel: Remote Port Forwarding

A tunnel with large yellow earth-moving equipment.

Local and remote SSH tunnels serve the same fundamental purpose: they make it possible to securely send data across an unsecured network. The implementation details are subtly different though. A local SSH tunnel acts like a secure bridge from a local machine to a remote server. It’s ideal for accessing services on the remote server which aren’t publicly exposed. Conversely, a remote SSH tunnel reverses this direction, forwarding traffic from the remote server back to a local machine (or another machine).

The critical distinction between the two is the direction of the connection between the remote and local machines.

Read More →

SSH Tunnel: Local Port Forwarding

A tunnel with large yellow earth-moving equipment.

SSH tunnels are a powerful and secure method for transmitting data over potentially unsecured networks. They allow users to establish an encrypted connection between their local machine and a remote server, providing a secure and private pathway for data. An SSH tunnel will allow a service running on a remote machine to appear as if it is running on a local machine. This is also known as port forwarding.

Read More →

Static Redirects on Vercel

Moored boats in an art deco style.

A redirect is a rule which sends users to a different URL than the one they requested. They are most commonly used to ensure that browsers still get to the correct page after it has been moved to a new URL.

If you have a relatively small number of redirects and don’t need to do anything too fancy then static (or “configuration”) redirects are a good option. Static redirects are configured on Vercel by adding entries to the vercel.json configuration file. There’s just one major snag: you can only create 1024 redirects using this mechanism.

Read More →

Batch Resolving Merge Conflicts

A surrealistic image of the confluence between two rivers.

Sometimes when you run git merge you will be confronted with a huge load of merge conflicts. However, if you are lucky there might be a clear rule which you can apply to each of those conflicts, either

  • accept current change (change on current branch or ours) or
  • accept incoming change (incoming change from other branch or theirs).

In this case you can save yourself a lot of time and effort by specifying a particular merge strategy option.

Read More →

Externalise CSS

Externalise CSS

By default Gatsby will embed CSS into the <head> of each HTML page. This is not ideal. In this post I take a look at how to move that CSS into an external file and how the contents of that file can be optimised to remove unused CSS.

Read More →

Dynamic Routing

Month of Gatsby
Dynamic Routing with Gatsby

Suppose that you want to make your site routing a little more flexible. For example, rather than just going straight to a 404 page if the path is not found, you might want to try and guess an appropriate (and valid!) path. This is where dynamic routing comes into play.

Read More →

Custom 404 Page

Month of Gatsby
Custom 404 Page

Setting up a custom 404 page can add something special to your site. It provides you with the opportunity to do something memorable in the unfortunate event that a user asks for an unknown page.

Read More →