Resurrecting MySQL into PostgreSQL with PGLoader

I’ve been hosting a MySQL database on a DigitalOcean server for a few years. The project has been on hold for a while. Entropy kicked in and the server became unreachable. Fortunately I was still able to access the server via a recovery console to export the database using mysqldump and download the resulting SQL dump file.
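For the record, the export on the recovery console would have looked something like this (the database name is a placeholder; yours will differ):

mysqldump -uroot -p backup >backup.sql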

Now I want to resurrect the database locally but I also want to migrate it to PostgreSQL.

MySQL

I’m going to spin up a local MySQL server and, for consistency, it should be the same version that was running on the original server. A quick check on the original server reveals what that was.

mysql --version
mysql  Ver 14.14 Distrib 5.7.32, for Linux (x86_64)

Aha! So I need MySQL 5.7.32.

MySQL Docker Container

Pull a Docker image which matches the version of MySQL.

docker pull mysql:5.7.32

Set a password for the MySQL root user.

MYSQL_ROOT_PASSWORD=PhwPzjP39KmMI73GTXz98Z3n

Then create a Docker container running MySQL.

docker run -d --rm --name mysql -e MYSQL_ROOT_PASSWORD=$MYSQL_ROOT_PASSWORD mysql:5.7.32
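Note that --rm removes the container, and with it the loaded data, as soon as it stops. For a throwaway migration that’s fine, but if you’d like the data to survive you could bind-mount a host directory over the MySQL data directory (the host path here is arbitrary):

docker run -d --rm --name mysql -e MYSQL_ROOT_PASSWORD=$MYSQL_ROOT_PASSWORD \
       -v $PWD/mysql-data:/var/lib/mysql mysql:5.7.32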

Connecting to the MySQL Container

Connect to MySQL using the mysql client in the running container.

docker exec -it mysql mysql -uroot -p

Enter the password at the prompt and you’re in.

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.7.32 MySQL Community Server (GPL)

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

If you have the MySQL client installed on your machine then you can also connect to MySQL directly. First you’ll need to find the IP address of the container.

docker inspect mysql | grep IPAddress
"IPAddress": "172.17.0.2"

So MySQL is at 172.17.0.2 and using the default port (3306).

mysql -h 172.17.0.2 -P 3306 -uroot -p
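If you’d rather grab the address in a script than eyeball the grep output, docker inspect also takes a Go template (this assumes the container is on the default bridge network):

MYSQL_IP=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' mysql)
mysql -h $MYSQL_IP -P 3306 -uroot -p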

You can also use whatever other database client you prefer. My personal preference is DBeaver.

Create a Source Database

We’ll need a database into which the SQL dump will be loaded.

mysql> create database backup;

Populate the Database

Now, back in the shell, load the SQL dump (in file backup.sql), specifying the name of the database which you’ve just created.

docker exec -i mysql mysql -uroot -p$MYSQL_ROOT_PASSWORD backup <backup.sql

Depending on the size of the dump this may take a little while to run. Ensure that you have sufficient disk space to accommodate the data.
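If you’d like some feedback while a big dump loads and you have pv installed, you can pipe the file through it to get a progress bar:

pv backup.sql | docker exec -i mysql mysql -uroot -p$MYSQL_ROOT_PASSWORD backup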

Check Data on MySQL

Let’s take a look at the resulting database. Back into the mysql client.

mysql> use backup;
mysql> show tables;
+----------------------------------+
| Tables_in_backup                 |
+----------------------------------+
| athlete                          |
| athlete_note                     |
| club                             |
| club_pattern                     |
| club_type                        |
| event                            |
| license                          |
| province                         |
| race                             |
| race_distance                    |
| race_tag                         |
| race_tags                        |
| result                           |
| staging                          |
| substitution                     |
| substitution_contact             |
+----------------------------------+
16 rows in set (0.00 sec)

Those are all of the tables I was expecting to see. Let’s crack one of them open.

mysql> select id, name from athlete limit 10;
+----------------------------------+------------------------+
| id                               | name                   |
+----------------------------------+------------------------+
| 00006f662c4e45c2a0285603865a8e3c | Derrick Majozi         |
| 00009eab3ba64cefa6fae639ed413b6e | S Mbatha               |
| 0000dc7e90014192baecb1ff6da3b210 | Jonathan van den Bergh |
| 0001051eebdb4d3b9c74c37d62fc8207 | Graham Swartz          |
| 000143662d83435e9d5dce7944f62aa7 | Darren Moosa           |
| 0001650aa646420590bb64c6b8b82bb8 | Rashaad Fortune        |
| 000184586e81442db9051ca432c8ac6c | Jimmy Cochin           |
| 0001b2104a6244729dd6fe4fc57e2703 | Thomas Dalais          |
| 000201d90ae84595b7cd4e629890c803 | Never Mutzyula         |
| 00020b43f0354d97ad1ebed8e15ae3d8 | R Mathebula            |
+----------------------------------+------------------------+
10 rows in set (0.00 sec)

Looks good.

PostgreSQL

The source MySQL database is ready. Now we need to prepare the target PostgreSQL database.

PostgreSQL Docker Container

Again we’ll pull a specific image (you can choose a different version if you prefer).

docker pull postgres:9.6.16

Choose a password for the postgres user.

POSTGRES_PASSWORD=tuOtMvG219aYQJrCZcPe9wIC

And fire up a container.

docker run -d --rm --name postgres -e POSTGRES_PASSWORD=$POSTGRES_PASSWORD postgres:9.6.16
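If you’d rather reach PostgreSQL on localhost than via the container’s IP address, you could publish the port instead (a variation; the rest of this post sticks with the container IP):

docker run -d --rm --name postgres -e POSTGRES_PASSWORD=$POSTGRES_PASSWORD \
       -p 5432:5432 postgres:9.6.16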

Connecting to the PostgreSQL Container

Connect to the PostgreSQL server using the psql client in the running container.

docker exec -it postgres psql -U postgres
psql (9.6.16)
Type "help" for help.

postgres=#

Alternatively, you can use psql (or any other suitable database client) installed directly on your machine. You’ll need to find the IP address of the PostgreSQL container first though.

docker inspect postgres | grep IPAddress
"IPAddress": "172.17.0.3"

The PostgreSQL server is at 172.17.0.3.

psql -h 172.17.0.3 -U postgres

Create a Target Database

Now we need to create a database on the PostgreSQL server which will act as the target for the migration. Use the createdb utility in the running container to do this.

docker exec -it postgres createdb -U postgres backup
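Equivalently you could issue the SQL yourself from within psql:

postgres=# create database backup;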

Both the source and target are now in place. We’re ready to migrate.

Migration

PGLoader, developed by Dimitri Fontaine, is a tool for moving data into PostgreSQL. It makes the migration process both simple and robust.

Installing PGLoader

Either install it directly.

sudo apt update
sudo apt install pgloader

Or use Docker.

docker pull dimitri/pgloader

Check on the version.

pgloader --version
pgloader version "3.6.1"
compiled with SBCL 1.4.15.debian
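If you took the Docker route instead, the same check runs via the container (which, usefully, can also reach the MySQL and PostgreSQL containers on the default bridge network):

docker run --rm dimitri/pgloader pgloader --version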

Migrate Data from MySQL to PostgreSQL

We need to execute pgloader, providing the details of the source and target databases. The credentials, IP address and database name for each side are packed into mysql:// and pgsql:// connection URLs.

pgloader mysql://root:$MYSQL_ROOT_PASSWORD@172.17.0.2/backup \
         pgsql://postgres:$POSTGRES_PASSWORD@172.17.0.3/backup
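That one-liner is all a simple migration needs. If you want more control over the process, pgloader also accepts a command file. Here’s a minimal sketch of the equivalent, saved as migrate.load (the file name and the WITH options are illustrative choices from the documentation):

LOAD DATABASE
     FROM mysql://root:PhwPzjP39KmMI73GTXz98Z3n@172.17.0.2/backup
     INTO pgsql://postgres:tuOtMvG219aYQJrCZcPe9wIC@172.17.0.3/backup
 WITH include drop, create tables, create indexes, reset sequences;

Then run it with:

pgloader migrate.load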

When pgloader finishes, it presents a summary table of the migration process. Check the errors column to ensure that there were no problems.

Check Data on PostgreSQL

Finally let’s verify that the migrated data are intact. Connect to the PostgreSQL server.

psql -h 172.17.0.3 -U postgres

Select the backup database and list its tables.

postgres=# \c backup
backup=# \dt
                      List of relations
 Schema |               Name               | Type  |  Owner   
--------+----------------------------------+-------+----------
 backup | athlete                          | table | postgres
 backup | athlete_note                     | table | postgres
 backup | club                             | table | postgres
 backup | club_pattern                     | table | postgres
 backup | club_type                        | table | postgres
 backup | event                            | table | postgres
 backup | license                          | table | postgres
 backup | province                         | table | postgres
 backup | race                             | table | postgres
 backup | race_distance                    | table | postgres
 backup | race_tag                         | table | postgres
 backup | race_tags                        | table | postgres
 backup | result                           | table | postgres
 backup | staging                          | table | postgres
 backup | substitution                     | table | postgres
 backup | substitution_contact             | table | postgres
(16 rows)

Matches what we had on MySQL. Notice that pgloader has put the tables into a schema called backup (named after the source database) rather than into public; it also sets the database’s search_path to include that schema, which is why the unqualified query below just works. What about the contents of a table?

backup=# select id, name from athlete limit 10;
                id                |          name          
----------------------------------+------------------------
 00006f662c4e45c2a0285603865a8e3c | Derrick Majozi
 00009eab3ba64cefa6fae639ed413b6e | S Mbatha
 0000dc7e90014192baecb1ff6da3b210 | Jonathan van den Bergh
 0001051eebdb4d3b9c74c37d62fc8207 | Graham Swartz
 000143662d83435e9d5dce7944f62aa7 | Darren Moosa
 0001650aa646420590bb64c6b8b82bb8 | Rashaad Fortune
 000184586e81442db9051ca432c8ac6c | Jimmy Cochin
 0001b2104a6244729dd6fe4fc57e2703 | Thomas Dalais
 000201d90ae84595b7cd4e629890c803 | Never Mutzyula
 00020b43f0354d97ad1ebed8e15ae3d8 | R Mathebula
(10 rows)
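Spot checks are reassuring, but back in the shell you could also compare row counts between the two servers. A quick sketch (the table list is abbreviated; extend it to taste):

for TABLE in athlete club race result; do
    MYSQL_COUNT=$(docker exec mysql mysql -uroot -p$MYSQL_ROOT_PASSWORD -N -e "select count(*) from backup.$TABLE;")
    PGSQL_COUNT=$(docker exec postgres psql -U postgres -d backup -tAc "select count(*) from backup.$TABLE;")
    echo "$TABLE: MySQL=$MYSQL_COUNT PostgreSQL=$PGSQL_COUNT"
done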

Looks like the data are intact. Perfect! That was a remarkably painless process thanks to an amazing set of Open Source tools.

PGLoader is not limited to migrating from MySQL. It handles CSV and fixed-width files, MaxMind GeoLite data, dBase and SQLite too. A very versatile tool indeed.
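For a taste of the CSV side, here’s an untested sketch of a load file based on the documented LOAD CSV form (the file name, table name, columns and options are all illustrative):

LOAD CSV
     FROM 'athletes.csv' (id, name)
     INTO pgsql://postgres:tuOtMvG219aYQJrCZcPe9wIC@172.17.0.3/backup?athlete_import (id, name)
     WITH skip header = 1,
          fields terminated by ',';

Learn more in the excellent documentation.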