Uploading CSV to MySQL

I occasionally need to upload the contents of a CSV file to a MySQL database. It happens sufficiently infrequently that I need to remind myself how it works each time. Hopefully this will make it easier next time around.

The Data.

Suppose that you have a file, prices.csv, that looks like this:

"time","product","price"
2020-06-03 22:33:39,"Basic T-Shirt",299
2020-07-22 21:32:21,"Pique Polo",429
2020-04-07 05:38:17,"COUNTRY ROAD Slub Frill T-Shirt",299
2020-04-23 03:54:09,"Caribbean Tan Mousse Gradual A 150ml",95.95
2020-04-01 05:01:29,"Pulled Pork Shoulder 500g",79.99
2020-05-15 12:26:48,"Back To Work Blazer",2299
2020-07-13 06:28:27,"Funnel Neck Cardigan",1499
2020-06-03 17:07:50,"Extra Depth 180TC Cotton Blend Fitted Sheet",279
2020-07-28 02:00:29,"Clover Seal Full Cream Fresh Milk 1l",17.99

Create Table

First we need to create a table.

CREATE TABLE price (
  `time` datetime DEFAULT NULL,
  `product` varchar(512) DEFAULT NULL,
  `price` float DEFAULT NULL
);

Upload Data

In order to upload a file we need to launch the mysql client with the --local-infile flag.

mysql --local-infile

We also need to configure the server to allow file upload.

SET GLOBAL local_infile=1;

Now we’re ready for the upload.

LOAD DATA LOCAL INFILE 'prices.csv'
INTO TABLE price
FIELDS TERMINATED BY ','
ENCLOSED BY '\"'
IGNORE 1 LINES;

The resulting table will be perfectly functional and will contain the data from the CSV file. However, there’s one important ingredient missing: a primary key.

Create Table with Primary Key

Let’s drop the initial version of the table and create a new version which has a primary key.

DROP TABLE price;

CREATE TABLE price (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `time` datetime DEFAULT NULL,
  `product` varchar(512) DEFAULT NULL,
  `price` float DEFAULT NULL,
  PRIMARY KEY (`id`)
);

Upload Data Again

Now we need to just specify which columns we’re filling from the CSV file.

LOAD DATA LOCAL INFILE 'prices.csv'
INTO TABLE price
FIELDS TERMINATED BY ','
ENCLOSED BY '\"'
IGNORE 1 LINES
(time, product, price);

The primary key column, id, will be automatically populated.