ChromeDriver in GitLab CI Pipeline

You might need to run a Selenium crawler in a GitLab CI pipeline. Here’s how to get that set up.

The Crawler

Well, it’s not much of a crawler, but it illustrates the setup. This is the script, saved as run.py, that I want to run via GitLab CI.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()

options.add_argument("--no-sandbox")
options.add_argument("--headless")
options.add_argument("--window-size=1920,1080")

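driver = webdriver.Chrome(options=options)
try:
    driver.get('http://www.example.com')

    print(driver.page_source)
finally:
    # Always shut the browser down so the CI job does not leave Chrome running.
    driver.quit()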

Three options are specified, and the first two are critical.

  • --no-sandbox — This is necessary if you’re going to be launching Chrome as the root user.
  • --headless — Since we’re running Chrome in Docker it needs to be headless.
  • --window-size=1920,1080 — Not strictly necessary but I like to set a specific window size. Superstition!
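
If the same script also runs outside CI as a non-root user, the sandbox can stay enabled there. Here is a minimal sketch, assuming a hypothetical helper chrome_arguments() (not part of Selenium) that only adds --no-sandbox when the process runs as root:

```python
import os


def chrome_arguments(is_root: bool) -> list[str]:
    """Return Chrome command-line flags for a headless run.

    Hypothetical helper for illustration; it is not part of Selenium.
    """
    arguments = ["--headless", "--window-size=1920,1080"]
    if is_root:
        # Chrome refuses to start as root unless the sandbox is disabled.
        arguments.insert(0, "--no-sandbox")
    return arguments


def running_as_root() -> bool:
    # os.geteuid() only exists on Unix; assume non-root elsewhere.
    return hasattr(os, "geteuid") and os.geteuid() == 0


if __name__ == "__main__":
    for argument in chrome_arguments(running_as_root()):
        print(argument)
```

Each returned flag can then be passed to options.add_argument() in the script above.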

The CI Pipeline

And here’s the .gitlab-ci.yml file to create the pipeline. We’re installing Chrome and ChromeDriver from Chrome for Testing.

chrome:
  image: python:3.11.4
  stage: test
  before_script:
    - apt-get update -qq -y
    - >
      apt-get install -y wget unzip fonts-liberation libasound2 libatk-bridge2.0-0 libatk1.0-0 libatspi2.0-0
      libcups2 libdbus-1-3 libdrm2 libgbm1 libgtk-4-1 libnspr4 libnss3 libu2f-udev libvulkan1 libxcomposite1
      libxdamage1 libxfixes3 libxkbcommon0 libxrandr2 xdg-utils
    # Chrome
    - wget -q -O chrome-linux64.zip https://bit.ly/chrome-linux64-121-0-6167-85
    - unzip chrome-linux64.zip
    - rm chrome-linux64.zip
    - mv chrome-linux64 /opt/chrome/
    - ln -s /opt/chrome/chrome /usr/local/bin/
    # Chromedriver
    - wget -q -O chromedriver-linux64.zip https://bit.ly/chromedriver-linux64-121-0-6167-85
    - unzip -j chromedriver-linux64.zip chromedriver-linux64/chromedriver
    - rm chromedriver-linux64.zip
    - mv chromedriver /usr/local/bin/
    - chrome --version
    - chromedriver --version
    - pip3 install selenium
  script:
    - python3 run.py

What’s going on there? Here’s a breakdown of the steps:

  1. Install wget and unzip. We’ll use wget for the two downloads and unzip to unpack the resulting ZIP archives. Also install the shared libraries that Chrome requires.
  2. Download a specific version of Chrome. For brevity I’m using a shortened URL. The full URL is https://edgedl.me.gvt1.com/edgedl/chrome/chrome-for-testing/121.0.6167.85/linux64/chrome-linux64.zip. Unpack the archive, move it to /opt/chrome and symlink the chrome binary into the execution path.
  3. Download a specific version of ChromeDriver. The full URL is https://edgedl.me.gvt1.com/edgedl/chrome/chrome-for-testing/121.0.6167.85/linux64/chromedriver-linux64.zip. Unzip and install this directly into the execution path.
  4. Check on the versions of Chrome and ChromeDriver.
  5. Install the Selenium package for Python.
  6. Run the script.
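
The two full URLs in steps 2 and 3 share the same shape, so bumping the version only means changing one value. Here is a small sketch that rebuilds them, assuming the pattern seen in those two URLs also holds for other versions. The function name cft_download_url is made up for this example.

```python
BASE_URL = "https://edgedl.me.gvt1.com/edgedl/chrome/chrome-for-testing"


def cft_download_url(binary: str, version: str, platform: str = "linux64") -> str:
    """Build a Chrome for Testing download URL.

    The pattern is inferred from the two URLs quoted in the steps above.
    """
    return f"{BASE_URL}/{version}/{platform}/{binary}-{platform}.zip"


if __name__ == "__main__":
    # Print the Chrome and ChromeDriver URLs for the version used in the pipeline.
    for binary in ("chrome", "chromedriver"):
        print(cft_download_url(binary, "121.0.6167.85"))
```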

The CI log will show the installed versions of Chrome and ChromeDriver.

$ chrome --version
Google Chrome 121.0.6167.85
$ chromedriver --version
ChromeDriver 121.0.6167.85

And the script will dump the HTML contents from http://www.example.com.
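
The pipeline pins Chrome and ChromeDriver to the same version and prints both so that a mismatch is easy to spot. Here is a rough sketch of how that comparison could be automated, assuming the version output contains a four-part version number as in the log above. Both extract_version and versions_match are hypothetical helpers.

```python
import re

# Four dot-separated numbers, e.g. 121.0.6167.85
VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+\.\d+")


def extract_version(output: str) -> str:
    """Pull the four-part version number out of a --version line."""
    match = VERSION_PATTERN.search(output)
    if match is None:
        raise ValueError(f"no version number found in {output!r}")
    return match.group(0)


def versions_match(chrome_output: str, chromedriver_output: str) -> bool:
    """Return True when both --version lines report the same version."""
    return extract_version(chrome_output) == extract_version(chromedriver_output)


if __name__ == "__main__":
    print(versions_match("Google Chrome 121.0.6167.85", "ChromeDriver 121.0.6167.85"))
```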