Andrew B. Collier / @datawookie




{emayili} UTF-8 Filenames & Setting Sender


Version 0.4.6 of {emayili}, a package for easily sending emails from R, adds two new features.

Package Setup

If you have not already installed the package, then grab it from CRAN or GitHub.

# From CRAN.
install.packages("emayili")
# From GitHub.
remotes::install_github("datawookie/emayili")

Load the package.

library(emayili)

Check that you have the current version.

packageVersion("emayili")
[1] ‘0.4.6’

Let’s quickly set up an SMTP server. We’ll use SMTP Bucket, which is incredibly convenient for testing.

SMTP_SERVER <- "mail.smtpbucket.com"
SMTP_PORT   <- 8025

smtp <- server(host = SMTP_SERVER, port = SMTP_PORT)

UTF-8 Characters in Attachment Filenames

It’s now possible to attach files with names that include non-ASCII characters. Suppose I wanted to send this image (source) of Wenceslao Moreno.

Read More →

Resurrecting MySQL into PostgreSQL with PGLoader

I’ve been hosting a MySQL database on a DigitalOcean server for a few years. The project has been on hold for a while. Entropy kicked in and the server became unreachable. Fortunately, I was still able to access the server via a recovery console, export the database using mysqldump and download the resulting SQL dump file.

Now I want to resurrect the database locally but I also want to migrate it to PostgreSQL.

Read More →

Levies, Tax and the Fuel Price in South Africa

According to the Automobile Association (AA) the fuel price is the sum of four main components:

  • the basic fuel price
  • the general fuel levy
  • the Road Accident Fund (RAF) levy and
  • wholesale and retail margins, distribution and transport costs.

This article suggests that almost 70% of the fuel price in South Africa is due to taxes and levies.

I used data from {saffer} to examine this assertion.
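The assertion boils down to a simple ratio: the levies divided by the sum of all four components. Here is a minimal Python sketch of that arithmetic. The function name `levy_share` is hypothetical, the numbers are placeholders rather than real prices, and it assumes that only the general fuel levy and the RAF levy count as "taxes and levies".

```python
def levy_share(basic_price, general_levy, raf_levy, margins_and_costs):
    """Return the fraction of the pump price made up of the two levies.

    All arguments must be in the same unit (for example Rand per litre).
    Assumption: only the general fuel levy and RAF levy count as levies.
    """
    total = basic_price + general_levy + raf_levy + margins_and_costs
    if total <= 0:
        raise ValueError("total fuel price must be positive")
    return (general_levy + raf_levy) / total


# Placeholder values for illustration only, NOT actual fuel price data.
share = levy_share(
    basic_price=10.0,
    general_levy=4.0,
    raf_levy=2.0,
    margins_and_costs=4.0,
)
print(f"Levies make up {share:.0%} of the price")
```

Feeding in the real component values for a given month would show whether the levy share comes anywhere near the quoted figure.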

Read More →

Persistent Selenium Sessions

I have a project where I need to have a persistent Selenium session. There’s a script which will leave a browser window open when it exits. When the script runs again it should connect to the same window.

Read More →

SQLAlchemy: Efficient Counting

I have a SQLAlchemy count() query which is called fairly frequently in my API. The query itself is not especially slow, but it runs often enough to have a noticeable performance impact.
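One possible mitigation for a count that is cheap once but expensive in aggregate is to cache the result for a short time. This is a sketch of that idea only, not necessarily the approach taken in the full post. The `CachedCount` class is hypothetical, and so are the `session` and `Item` names in the usage comment.

```python
import time
from typing import Callable, Optional


class CachedCount:
    """Cache the result of an expensive count for a short time-to-live."""

    def __init__(
        self,
        count_fn: Callable[[], int],
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._count_fn = count_fn      # callable that runs the real query
        self._ttl = ttl_seconds        # how long a cached value stays fresh
        self._clock = clock            # injectable so it can be tested
        self._value: Optional[int] = None
        self._expires_at = 0.0

    def get(self) -> int:
        """Return the cached count, refreshing it if missing or stale."""
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._count_fn()
            self._expires_at = now + self._ttl
        return self._value

    def invalidate(self) -> None:
        """Force the next get() to run the query again."""
        self._value = None


# Hypothetical usage with a SQLAlchemy session and an Item model:
#   item_count = CachedCount(lambda: session.query(Item).count(), ttl_seconds=60)
#   total = item_count.get()
```

The trade-off is staleness: callers may see a count that is up to `ttl_seconds` old, so this only suits endpoints where an approximate figure is acceptable.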

Read More →

GitLab CI: Services

I needed to have a Redis server available as part of the GitLab CI pipeline for this blog (simply because I wanted to use the {rredis} package). After fiddling around for some time trying to install the redis-server package using apt, I discovered that GitLab CI actually provides Redis as a service, which makes the process remarkably easy.

Some details of the “standard” services (Redis, PostgreSQL and MySQL) supported by GitLab CI can be found here:

Read More →

Scrapy Ban Policies with Rotating Proxies

The scrapy-rotating-proxies package makes it simple to use rotating proxies with Scrapy.

One issue that I’ve run into, though, is that pages which return a 404 error are retried (and the corresponding proxy is marked as dead). This does not make sense to me: if a server returns a 404 error, it generally means that the requested page is just not available. It’s not a proxy problem; it’s a URL problem.
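A custom ban policy is one way to express "a 404 is not a ban". The sketch below subclasses the package's `BanDetectionPolicy` and exempts 404 responses. The class name `NotFoundIsNotBanPolicy` and the `myproject.policy` module path are hypothetical. The fallback base class exists only so the snippet runs without the package installed, and it is an approximation of the default behaviour rather than a copy of it.

```python
try:
    # Real base class from scrapy-rotating-proxies, if it is installed.
    from rotating_proxies.policy import BanDetectionPolicy
except ImportError:
    class BanDetectionPolicy:
        """Stand-in (assumption): approximates the package's default policy."""

        NOT_BAN_STATUSES = {200, 301, 302}

        def response_is_ban(self, request, response):
            # Any status outside the allowed set is treated as a ban.
            if response.status not in self.NOT_BAN_STATUSES:
                return True
            # An empty 200 response is also treated as a ban.
            if response.status == 200 and not len(response.body):
                return True
            return False

        def exception_is_ban(self, request, exception):
            return True


class NotFoundIsNotBanPolicy(BanDetectionPolicy):
    """Treat 404 as a problem with the URL, not with the proxy."""

    def response_is_ban(self, request, response):
        if response.status == 404:
            return False
        # Defer to the default rules for every other status.
        return super().response_is_ban(request, response)


# In settings.py (module path is hypothetical):
#   ROTATING_PROXY_BAN_POLICY = "myproject.policy.NotFoundIsNotBanPolicy"
```

With a policy like this, a 404 response should pass through to the spider without the proxy being marked as dead.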

Read More →