Migrating from GitLab Pages to Vercel

The process of migrating a personal Hugo blog, built using {blogdown}, from GitLab Pages to Vercel.
Tags: GitLab, Vercel, {blogdown}
Published: 10 Nov 2024 12:00

I recently migrated this blog from GitLab Pages to Vercel. There were two main reasons for the move:

  1. The blog was taking too long to build on GitLab Pages, which hindered efficient updates and added unnecessary delays to my workflow. Admittedly, this was partially my own doing since my build process was far too complicated.
  2. I wanted greater control over redirects (specifically the ability to redirect URLs that didn’t end in a slash to ones that did, which was apparently important for SEO purposes).

🚀 TL;DR If you don’t have the patience to wade through the details then skip down to the end for a brief summary.

Pages with Redirects

The second of these was the real driver. I was motivated by a periodic report from Google Search Console which indicated that a number of my pages were not being indexed because they were redirects.

Upon further investigation I found that Google was referring to redirects that enforced trailing slashes on my URLs, for example from https://datawookie.dev/blog (without a trailing slash) to https://datawookie.dev/blog/. These normalisation redirects appear to originate from the GitLab Pages server itself. They have apparently been there for a while (at least since August 2022) and are fairly consistent in number.

What’s Happening on GitLab Pages?

I used curl -v to better understand what happens to a URL without a trailing slash.

curl -v https://datawookie.dev/blog
* Host datawookie.dev:443 was resolved.
* Connected to datawookie.dev (35.185.44.232) port 443
> GET /blog HTTP/2
> Host: datawookie.dev
> User-Agent: curl/8.5.0
> Accept: */*
> 
< HTTP/2 302 
< content-type: text/html; charset=utf-8
< location: //datawookie.dev/blog/
< permissions-policy: interest-cohort=()
< vary: Origin
< content-length: 57
< date: Tue, 12 Nov 2024 06:42:34 GMT
< 
<a href="//datawookie.dev/blog/">Found</a>.

The curl output has been edited to remove some irrelevant content. It’s apparent that the server generated a 302 Found redirect. The location: header gives the target URL for the redirect (which simply adds the trailing slash).

It appears that requests to GitLab Pages are proxied via NGINX, so it’s likely that this behaviour is being introduced by NGINX. A cursory bit of additional research supports this. Not important, but interesting.
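
To pull out just the status code and redirect target without the rest of the verbose output, the response headers can be piped through a small filter. This is a minimal sketch: summarise_redirect is a hypothetical helper name (not part of curl), and it assumes the input holds the headers of a single response.

```shell
# summarise_redirect: hypothetical helper that reads raw HTTP response
# headers on stdin and prints "<status> <location>".
summarise_redirect() {
  # Strip carriage returns (HTTP headers end in CRLF), then pick out the
  # status line and the Location header (header names are case-insensitive).
  tr -d '\r' | awk '
    /^HTTP\// { status = $2 }
    tolower($1) == "location:" { location = $2 }
    END { print status, location }
  '
}
```

With network access it could be used as curl -sI https://datawookie.dev/blog | summarise_redirect, where -s silences progress output and -I requests headers only.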

Preparation

To prepare for the migration and streamline builds, I made the following changes to the repository:

  1. I previously had a Hugo theme as a sub-module. I moved it directly into the repository.
  2. I was processing some .Rmd files on each build via GitLab CI. For these files I did not include the corresponding .md file in the repository since it was generated in CI. This process significantly slowed down the site build time. I am now including all of the .md files and not rebuilding in CI. This means that some posts (like Your Life in Weeks) are no longer getting rebuilt daily. I’ll fix that sometime. Maybe.
  3. I upgraded to Hugo v0.138.0 and had to rename config.yaml to hugo.yaml.

With the repository updated and streamlined, I was ready to set up the deployment on Vercel.

Vercel Preliminaries

(Not) Using the API

I initially tried pushing my site to Vercel using the API. However, I quickly found that this would not be ideal since there was a daily limit with a free (personal) Vercel plan. It made more sense to let Vercel itself take care of the deployment.

Add a New Project

On the Vercel dashboard press the button and choose the Project option.

Now, if you have not already done so, select a Git provider. You can choose between GitHub, GitLab and Bitbucket. Once you have made your selection and authenticated you should see a list of repositories. Choose the one for your site and then press the button.

Next you’ll need to fill out the New Project details.

  1. Give it a suitable name.
  2. Click the Framework Preset dropdown and find Hugo.
  3. You can specify a build command (although the default should be 100% fine).
  4. The default Hugo version on Vercel is not current. If you want a specific version then set the HUGO_VERSION environment variable. Add in any other environment variables that you might need for the build.
  5. Press the big button.

A build will start immediately and you can follow the progress in the Build Logs tab. This is probably worth doing at least on your first build.

When your build is complete you can click on the big button and be taken to your freshly deployed site. You’ll probably notice that it has a rather eccentric URL. Fear not, we’ll fix that shortly.

Create a Vercel Configuration File

I like to have as much configuration content in my repository as possible. At this point I created a vercel.json file in the root of the repository with the following content.

{
  "build": {
    "env": {
      "HUGO_VERSION": "0.138.0"
    }
  }
}

This just duplicates the environment variable that we set via the dashboard. You can go back and remove it from the dashboard because it will be picked up from the file for future builds.
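
Since a stray or missing brace in vercel.json is an easy mistake to make, it can be worth checking that the file parses as JSON before pushing. The sketch below is one way to do that, assuming python3 is available on the PATH; check_json is a hypothetical helper name. It only checks JSON syntax, not whether Vercel accepts the keys.

```shell
# check_json: hypothetical helper that reports whether a file is
# well-formed JSON. Assumes python3 is installed.
check_json() {
  if python3 -m json.tool "$1" > /dev/null 2>&1; then
    echo "valid"
  else
    echo "invalid"
    return 1
  fi
}
```

Running check_json vercel.json in the root of the repository prints valid or invalid.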

Update DNS

Once you have a working production deployment on Vercel and you’re happy that everything is working properly you can then point your DNS at it.

Find the Domains tab under Settings. Specify the required domain.

After you press the button you’ll be asked to set up an A record with your DNS provider. Make the change and wait for it to propagate across the DNS servers.

You can also create redirects from other sub-domains here. For example, I have my blog hosted at https://datawookie.dev and https://www.datawookie.dev redirects to it.

Redirects

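Finally we’re at the point where we can address the redirect issue. This can be done by adding a "redirects" section to the vercel.json configuration file. The updated file also sets "trailingSlash" and adds a "buildCommand" that passes the production URL to Hugo as the base URL.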

{
  "buildCommand": "hugo --gc -b https://$VERCEL_PROJECT_PRODUCTION_URL",
  "build": {
    "env": {
      "HUGO_VERSION": "0.138.0"
    }
  },
  "trailingSlash": true,
  "redirects": [
    {
      "source": "/(.*[^/])$",
      "destination": "/$1/",
      "permanent": true
    }
  ]
}
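
To get a feel for which paths the "source" pattern is meant to catch, the same regular expression can be tried locally with sed. This is only an approximation: Vercel does its own pattern matching, which may differ in detail from sed's extended regular expressions, and add_trailing_slash is a hypothetical helper name.

```shell
# add_trailing_slash: hypothetical helper that applies the redirect
# pattern from vercel.json to a path using sed (an approximation of
# Vercel's matcher, not the real thing).
add_trailing_slash() {
  # Paths that do not match the pattern are printed unchanged.
  printf '%s\n' "$1" | sed -E 's#^/(.*[^/])$#/\1/#'
}

add_trailing_slash /blog     # prints /blog/
add_trailing_slash /blog/    # prints /blog/ (already has a trailing slash)
add_trailing_slash /         # prints / (nothing to capture)
```

Read as a plain regular expression, the pattern also matches paths such as /blog/index.xml, so it is worth confirming on the deployed site how feed and asset URLs behave.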

Let’s test it.

curl -v https://datawookie.dev/blog
* Host datawookie.dev:443 was resolved.
* Connected to datawookie.dev (76.76.21.21) port 443
> GET /blog HTTP/2
> Host: datawookie.dev
> User-Agent: curl/8.5.0
> Accept: */*
> 
< HTTP/2 308 
< cache-control: public, max-age=0, must-revalidate
< content-type: text/plain
< date: Wed, 13 Nov 2024 17:27:28 GMT
< location: /blog/
< refresh: 0;url=/blog/
< server: Vercel
< strict-transport-security: max-age=63072000
< x-vercel-id: lhr1::8kbch-1731518848396-54e2be950a4e
< 
Redirecting...

Comparing this to the earlier output, we can see that we are now getting a 308 Permanent Redirect response rather than a 302 Found, which is what we were after!
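
The difference between the two responses comes down to whether the status code signals a permanent or a temporary redirect. As a small illustration, the sketch below classifies the common redirect status codes; redirect_kind is a hypothetical helper name.

```shell
# redirect_kind: hypothetical helper that classifies an HTTP status code.
redirect_kind() {
  case "$1" in
    301|308) echo "permanent" ;;   # Moved Permanently, Permanent Redirect
    302|307) echo "temporary" ;;   # Found, Temporary Redirect
    *)       echo "other" ;;
  esac
}

redirect_kind 302   # prints temporary
redirect_kind 308   # prints permanent
```

With network access, the status code itself can be fetched using curl -s -o /dev/null -w '%{http_code}' https://datawookie.dev/blog and passed to the helper.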

🚀 TL;DR Migrating from GitLab Pages to Vercel

I benefited from the move as a result of:

  • faster builds with a simpler process;
  • more flexibility for managing production and preview deployments; and
  • improved SEO through proper redirect handling.

The key components of the migration were:

  1. Preparing the Repository
    • Moved the Hugo theme into the repository.
    • Included pre-generated .md files to avoid rebuilding during CI.
    • Upgraded to Hugo v0.138.0 and adjusted configuration (this step was not required).
  2. Setting Up Vercel
    • Used the Vercel dashboard to add the project, configure build settings, and deploy.
    • Added a vercel.json file to manage settings like Hugo version and redirects.
  3. Configuring Base URLs
    • Leveraged Vercel environment variables to dynamically set the base URL for production and preview deployments.
  4. Handling Redirects
    • Added trailing slash redirects in vercel.json to fix SEO issues and align with Vercel’s redirect policies.
  5. Updating DNS
    • Pointed custom domain to Vercel and configured subdomain redirects.
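
The base URL item can be illustrated with a short sketch. It is not the exact configuration used for this blog (the vercel.json above passes only VERCEL_PROJECT_PRODUCTION_URL to Hugo): it assumes that Vercel's VERCEL_ENV and VERCEL_URL system environment variables are exposed to the build, and base_url is a hypothetical helper name.

```shell
# base_url: hypothetical helper that picks a base URL for Hugo depending
# on the kind of Vercel deployment. Assumes the VERCEL_ENV, VERCEL_URL and
# VERCEL_PROJECT_PRODUCTION_URL environment variables are set by Vercel.
base_url() {
  if [ "$VERCEL_ENV" = "production" ]; then
    # Production builds use the project's production domain.
    echo "https://$VERCEL_PROJECT_PRODUCTION_URL/"
  else
    # Preview builds use the URL generated for that deployment.
    echo "https://$VERCEL_URL/"
  fi
}
```

A build command could then call hugo --gc -b "$(base_url)" so that links in preview deployments point at the preview rather than at the production site.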

Some things to check afterwards:

  • Ensure that RSS feed still works. I checked that https://datawookie.dev/blog/index.xml still had relevant content.
  • Ensure that social <meta> tags still contain appropriate values. I used validator tools for Twitter, LinkedIn and Open Graph.
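
The RSS check can be partly automated. The sketch below counts the entries in a feed read from stdin. It assumes the feed is RSS with one <item> element per post, and count_feed_items is a hypothetical helper name.

```shell
# count_feed_items: hypothetical helper that counts <item> elements in an
# RSS document supplied on stdin.
count_feed_items() {
  # grep -o prints each match on its own line; wc -l counts them and
  # tr removes the padding that some wc implementations add.
  grep -o '<item>' | wc -l | tr -d ' '
}
```

With network access, curl -s https://datawookie.dev/blog/index.xml | count_feed_items should print a number greater than zero if the feed still has content.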