SSH Tunnel from Docker

I’m building a crawler which I’m going to wrap up in a Docker image. The crawler writes data to a remote MySQL database. However, there’s a catch: the database connection is via an SSH tunnel. Another wrinkle: the crawler is going to be run on ECS, so the whole thing (including setting up the SSH tunnel) needs to be baked into the Docker image.

This post illustrates the process of connecting to a remote MySQL database via an SSH tunnel from Docker. I’m not sure how secure this is. And there are probably better ways to do this. But it’s a start and it works!

SSH Keypair

First we’ll need to generate a new SSH keypair.

ssh-keygen -N "" -t rsa -f id_rsa

This will not prompt for a passphrase and will generate two files:

  • id_rsa (private key) and
  • id_rsa.pub (public key).

Append the contents of the public key, id_rsa.pub, to ~/.ssh/authorized_keys on the remote host. 🚨 Make sure that the whole public key is on a single line.
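If you have shell access to the remote host, the append step can be scripted. The sketch below is one way to do it, assuming the public key file has already been copied over. add_public_key is a hypothetical helper name, and both file paths are arguments so that it can be tried on scratch files first. It refuses a key file that spans more than one line and skips the append if the key is already present.

```shell
#!/bin/bash

# Hypothetical helper: append a public key to an authorized_keys file once.
add_public_key() {
  local pubkey_file="$1" auth_file="$2" count key
  # Count non-empty lines; a usable public key file has exactly one.
  count=$(grep -c . "$pubkey_file" 2>/dev/null || true)
  if [ "${count:-0}" -ne 1 ]; then
    echo "expected exactly one key line in $pubkey_file" >&2
    return 1
  fi
  key=$(grep . "$pubkey_file")
  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  chmod 600 "$auth_file"
  # -x matches the whole line, -F treats the key as a fixed string.
  grep -qxF -e "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
}
```

On the remote host this might be called as add_public_key id_rsa.pub ~/.ssh/authorized_keys. Where it is installed, ssh-copy-id -i id_rsa.pub user@host does a similar job from the local side.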

Dockerfile

Next we’ll set up the Dockerfile.

FROM ubuntu:20.04

ARG PRIVATE_KEY

RUN apt-get update -qq && \
    apt-get install -y -qq openssh-client mysql-client && \
    rm -rf /var/lib/apt/lists/*

RUN mkdir ~/.ssh && \
  echo "Host *" > ~/.ssh/config && \
  echo "  StrictHostKeyChecking accept-new" >> ~/.ssh/config && \
  echo "  ControlMaster auto" >> ~/.ssh/config && \
  echo "  ControlPath ~/.ssh/%r@%h:%p" >> ~/.ssh/config

COPY $PRIVATE_KEY /root/.ssh/id_rsa

COPY tunnel-mysql.sh .

CMD ./tunnel-mysql.sh

This does the following:

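  1. installs the SSH and MySQL clients;
  2. configures SSH not to prompt for confirmation when connecting to a new host, and to share a single master connection per host;
  3. copies an SSH private key across onto the image; and
  4. copies a BASH script onto the image and runs it by default.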
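The chain of echo commands in the Dockerfile is equivalent to writing the config file in one go with a here-document. The sketch below does that in a shell function so the result can be inspected locally. write_ssh_config and its directory argument are assumptions for illustration, not part of the original image.

```shell
#!/bin/bash

# Hypothetical helper: write the same SSH client config as the Dockerfile.
# The target directory defaults to ~/.ssh but can be a scratch directory.
write_ssh_config() {
  local ssh_dir="${1:-$HOME/.ssh}"
  mkdir -p "$ssh_dir"
  chmod 700 "$ssh_dir"
  # The quoted 'EOF' stops the shell from expanding anything in the body.
  cat > "$ssh_dir/config" <<'EOF'
Host *
  StrictHostKeyChecking accept-new
  ControlMaster auto
  ControlPath ~/.ssh/%r@%h:%p
EOF
}
```

Running write_ssh_config /tmp/ssh-test and then reading /tmp/ssh-test/config shows the four lines that end up in the image.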

Script

Now for the BASH script.

#!/bin/bash

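# Open the tunnel in the background: local port 3306 forwards to port 3306 on the remote host.
echo -n "Creating SSH tunnel to $HOST... "
ssh -4 -q -N -f -T -M -L 3306:127.0.0.1:3306 "$HOST"
echo "Done!"

# MYSQL_PWD lets the client authenticate without a password prompt.
export MYSQL_PWD="$PASSWORD"
mysql -e 'SHOW DATABASES;' --user "$USERNAME" -h 127.0.0.1

# Ask the master connection to exit, which tears down the tunnel.
echo -n "Closing SSH tunnel... "
ssh -q -T -O "exit" "$HOST"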
echo "Done!"

This sets up an SSH tunnel to the remote host, connects to the MySQL database on the remote host and executes a simple SQL query, then closes the SSH tunnel.

There are an awful lot of ssh options being used. Let’s quickly unpack those:

  • -4 — only use IPv4 addresses
  • -f — run in background
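  • -L 3306:127.0.0.1:3306 — forward local port 3306 to 127.0.0.1:3306 as seen from the remote host
  • -M — run in master mode, so that the connection can be controlled later (here, closed with -O exit)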
  • -N — don’t run a remote command
  • -q — run in quiet mode and
  • -T — don’t allocate a pseudo-terminal.
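The -L argument is the one that's easiest to get backwards, so here is a small sketch of how it is put together. forward_spec and tunnel_command are hypothetical helpers that only build and print strings; nothing connects anywhere. The sketch also adds -o ExitOnForwardFailure=yes, which is not in the script above: it is a standard OpenSSH option that makes ssh exit with an error if the local port cannot be bound, rather than carrying on without the forward.

```shell
#!/bin/bash

# Build the -L argument: LOCAL_PORT:TARGET_HOST:TARGET_PORT.
# TARGET_HOST is resolved on the remote side, so 127.0.0.1 means
# the SSH server itself, not the container.
forward_spec() {
  local local_port="$1" target_host="$2" target_port="$3"
  echo "${local_port}:${target_host}:${target_port}"
}

# Print the full ssh command for a given host and forward spec.
tunnel_command() {
  local host="$1" spec="$2"
  echo "ssh -4 -q -N -f -T -M -o ExitOnForwardFailure=yes -L ${spec} ${host}"
}

# Example: listen on local port 3307 in case 3306 is already taken locally.
tunnel_command "wookie@63.129.24.53" "$(forward_spec 3307 127.0.0.1 3306)"
```

With a local port other than 3306, the mysql client would also need a matching --port option.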

The SQL query is just a placeholder. Whatever database interactions you need to do would go here. So, in my case, this is where the crawler would kick in.
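A real workload can fail part way through, and it would be nice if the tunnel were closed regardless. One possible way to arrange that is sketched below. open_tunnel and close_tunnel wrap the two ssh commands from the script, and with_tunnel is a hypothetical wrapper that runs whatever command it is given in between and passes its exit status back.

```shell
#!/bin/bash

# These two functions wrap the ssh commands from the script above.
open_tunnel() {
  ssh -4 -q -N -f -T -M -L 3306:127.0.0.1:3306 "$HOST"
}

close_tunnel() {
  ssh -q -T -O "exit" "$HOST"
}

# Run a command with the tunnel open and close the tunnel afterwards,
# whether or not the command succeeded. Returns the command's exit status.
with_tunnel() {
  local status=0
  open_tunnel || return 1
  "$@" || status=$?
  close_tunnel || true
  return "$status"
}
```

The placeholder query would then become with_tunnel mysql -e 'SHOW DATABASES;' --user "$USERNAME" -h 127.0.0.1, and the crawler command could be dropped in the same way.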

Building & Running

Let’s build the image, passing id_rsa as the value for the PRIVATE_KEY build argument specified in the Dockerfile. This private key is going to be baked into the image. 💡 It’s equally possible to provide the private key at run time; however, I ultimately opted to supply it at build time.

docker build --build-arg PRIVATE_KEY=id_rsa -t docker-ssh-tunnel .

Once it’s built, run it.

docker run --rm -t --env-file .env docker-ssh-tunnel

We’re passing through some environment variables from an .env file which looks like this (all values fictitious!):

HOST=wookie@63.129.24.53
USERNAME=wookie
PASSWORD=04cmRXCPJ111coQpuqmHH6Uc
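If one of these variables is missing, the script fails in a fairly unhelpful way (ssh is called with no host, for instance). A small guard at the top of the script can catch that early. This is only a sketch: require_vars is a hypothetical helper, and it relies on bash indirect expansion to look up each variable by name.

```shell
#!/bin/bash

# Hypothetical helper: return non-zero if any named variable is unset or empty.
require_vars() {
  local name missing=0
  for name in "$@"; do
    # ${!name} expands to the value of the variable whose name is in $name.
    if [ -z "${!name:-}" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

In tunnel-mysql.sh this could be used as require_vars HOST USERNAME PASSWORD || exit 1 before the tunnel is opened.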

And the fruits of our labour:

Creating SSH tunnel to wookie@63.129.24.53... Done!
+---------------------+
| Database            |
+---------------------+
| information_schema  |
| mysql               |
| performance_schema  |
| sys                 |
+---------------------+
Closing SSH tunnel... Done!

Nice!

Automation

The final component is wrapping this up so that it will build using GitLab CI. Here’s the content of .gitlab-ci.yml:

stages:
  - build

variables:
  IMAGE_NAME: docker-ssh-tunnel
  TAG_LATEST: $CI_REGISTRY_IMAGE/$IMAGE_NAME:latest
  DOCKER_TLS_CERTDIR: ""

build:
  image: docker:stable
  stage: build
  only:
    - master
  services:
    - docker:dind
  script:
  - cp $PRIVATE_KEY id_rsa
  - docker build --build-arg PRIVATE_KEY=id_rsa -t $TAG_LATEST .
  - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY
  - docker push $TAG_LATEST

The content of id_rsa is stored in a CI/CD file variable called PRIVATE_KEY. This file is copied across into the Docker build context before the image is built.
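Keys pasted into a CI/CD variable sometimes lose their trailing newline or pick up stray content, and some OpenSSH versions can then reject the key as badly formatted. The sketch below is a sanity check that could run before the build. check_key_file is a hypothetical helper, and the three checks are my own guesses at what is worth testing, not anything GitLab requires.

```shell
#!/bin/bash

# Hypothetical helper: basic sanity checks on a private key file.
check_key_file() {
  local key_file="$1"
  if [ ! -s "$key_file" ]; then
    echo "$key_file is missing or empty" >&2
    return 1
  fi
  # A PEM or OpenSSH private key starts with a "-----BEGIN ..." header line.
  if ! head -n 1 "$key_file" | grep -q '^-----BEGIN '; then
    echo "$key_file does not look like a private key" >&2
    return 1
  fi
  # Command substitution strips a trailing newline, so an empty result
  # means the last byte of the file is a newline.
  if [ -n "$(tail -c 1 "$key_file")" ]; then
    echo "$key_file does not end with a newline" >&2
    return 1
  fi
}
```

In the job above it could run between the cp and docker build lines, for example from a small script checked into the repository.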

Figure: Private key stored as a CI/CD environment variable on GitLab.

So now the image is stored in a registry accessible from ECS. We can set environment variables on the ECS task to hold the values of HOST, USERNAME and PASSWORD.

🚨 One important caveat to this approach is that anybody who has access to the Docker image also has access to the SSH private key. You can (and should) mitigate the risk by (i) ensuring that the image is stored in a secure, private registry and (ii) using non-trivial database credentials.
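The run-time alternative mentioned earlier avoids baking the key into the image at all. Here is a rough sketch of what that might look like, assuming the key text arrives in an environment variable. SSH_PRIVATE_KEY and install_key are hypothetical names, and this is not what the image above does.

```shell
#!/bin/bash

# Hypothetical helper: write the key from SSH_PRIVATE_KEY to disk at start-up.
install_key() {
  local key_path="${1:-$HOME/.ssh/id_rsa}"
  if [ -z "${SSH_PRIVATE_KEY:-}" ]; then
    echo "SSH_PRIVATE_KEY is not set" >&2
    return 1
  fi
  mkdir -p "$(dirname "$key_path")"
  # The subshell keeps the umask change local. A newly created file gets
  # owner-only permissions; an existing file keeps whatever it had.
  ( umask 077 && printf '%s\n' "$SSH_PRIVATE_KEY" > "$key_path" )
}
```

With the COPY of the key removed from the Dockerfile and install_key called at the top of tunnel-mysql.sh, the container could be started with docker run --rm -t --env-file .env -e SSH_PRIVATE_KEY="$(cat id_rsa)" docker-ssh-tunnel. On ECS the value would more likely come from a secret attached to the task than from a plain environment variable.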

If you're doing this on a Mac then you might need to replace occurrences of 127.0.0.1 with host.docker.internal.