Private Security is a big industry in South Africa. Most Private Security companies promise to provide a rapid response to every callout generated by any of their customers. There is a delicate balance between the number of response vehicles and the number of customers (and the frequency of their callouts!), which determines whether or not they are able to honour this promise.
On the one hand, more response vehicles result in lower response times. However, these vehicles are expensive to maintain and staff. Fewer vehicles are more cost-effective, but make it difficult to maintain a high level of service.
Read More →
Linux has really come a long way. I used to arrive at the podium and hook up my (Linux) laptop with the resigned expectation that there would be some tweaking involved to get it to speak to the projector. However, the support for video hardware has evolved massively and nowadays I don’t ever think about this: it just works.
Until it doesn’t.
This week I was speaking at a conference where the video setup was extremely pernickety. It required a resolution of 1280 by 720 at a frequency of 50 Hz. Try to set that up using the desktop display configuration tools in Ubuntu… it just doesn’t seem to be possible.
Read More →
Your data are valuable. If, God forbid, some disaster befalls your database then you should have a plan in place for how to recover your data. In this post I describe a simple strategy for backing up a MySQL database. This might not be the best approach, but it has worked for me.
Read More →
I need to deploy Shiny on a Windows machine. I also need to use {checkpoint} for package management. Using Docker seems to be the only reasonable approach to Shiny on Windows. But how easy would it be to also factor {checkpoint} into this setup?
Only one reasonable way to find out: give it a try.
Below is the simple Dockerfile I used. Here are the fundamental components of what it does:
I was inspired by this visualisation, showing the optimal routes (by car) from the geographic centre of the USA to all counties.
Read More →
If you have multiple applications accessing OSRM data then it does not make sense for each of those to have a separate copy of the data resident in memory. This is especially true if you’re using a relatively large map, in which case memory consumed by multiple processes might be enormous.
An alternative is to store the map data in shared memory, allowing multiple processes to access a single copy of the data. The official OSRM documentation for using shared memory can be found here. This post gives further details.
Read More →
For some time I’ve wanted to recreate the cover art from Joy Division’s Unknown Pleasures album. The visualisation depicts successive pulses from the pulsar PSR B1919+21, discovered by Jocelyn Bell in 1967.
Read More →
I’m looking at ways to effectively visualise the splits data for the 2019 edition of the Comrades Marathon. My objectives are to provide:
How long does it take to cross the start line at the Comrades Marathon? If you’re lucky enough to be starting in one of the batches which is close to the front then this might be a matter of seconds to a couple of minutes. But if you’re in a batch closer to the back then this could be anything up to ten or eleven minutes. This is an agonising wait when all you want to do is start running.
Read More →
The Comrades Marathon is an epic ultramarathon run each year between Durban and Pietermaritzburg (South Africa).
A few years ago I put together a simple spreadsheet for generating a Comrades Marathon pacing strategy. But the spreadsheet was clunky to use and laborious to maintain. Plus I was frustrated by the crude plots (largely due to my limited spreadsheet proficiency). It seemed like an excellent opportunity to create a Shiny app.
Read More →
At Fathom Data we do a lot of automated reporting with R. Being able to easily and reliably send emails is a high priority.
There is already a selection of packages for sending email from R: {mailR}, {gmailr}, {blastula}, {blatr} (Windows), {mail} and {sendmailR}.
We’ve had the most experience with the first two, both of which are really solid packages. However, {gmailr} uses the Google Mail API so it doesn’t work with all SMTP servers and {mailR} has a dependency on {rJava} which can be a bit of a hurdle for deploying in some environments.
When I set up an R server for clients they often want to be able to install packages so that all users on the machine have access to them. This requires them to be able to install the packages onto the root filesystem rather than under their individual home directories.
It would be easy enough to give them su access, but this is a risky approach. There are so many other things on the system that they could break with this level of power.
I’m helping develop a new game concept, which is based on the sliding puzzle game. The idea is to randomise the initial configuration of the puzzle. However, I quickly discovered that half of the resulting configurations were not solvable. Not good! Here are two approaches to getting a solvable puzzle:
The first option is obviously more robust. It’s also a bit more work. The second option might require a few iterations, but it’s easy to implement.
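As a rough illustration of the "few iterations" idea, here is a minimal Python sketch of one way to get a solvable starting position: shuffle, test, and retry. It is not taken from the game itself. It assumes a square board of a given width, tiles numbered from 1 with 0 standing for the blank, and a solved state with the blank in the bottom-right corner. The function names are hypothetical. The test is the standard inversion-parity rule for sliding puzzles.

```python
import random


def count_inversions(tiles):
    """Count pairs of tiles that are out of order, ignoring the blank (0)."""
    flat = [t for t in tiles if t != 0]
    return sum(
        1
        for i in range(len(flat))
        for j in range(i + 1, len(flat))
        if flat[i] > flat[j]
    )


def is_solvable(tiles, width):
    """Inversion-parity test for a width x width board stored row by row.

    Assumes the goal state is 1, 2, ..., width*width - 1 followed by the blank.
    """
    inversions = count_inversions(tiles)
    if width % 2 == 1:
        # Odd width: solvable exactly when the inversion count is even.
        return inversions % 2 == 0
    # Even width: the row of the blank (counted from the bottom, starting
    # at 1) also matters. Solvable when inversions + that row is odd.
    blank_row_from_bottom = width - tiles.index(0) // width
    return (inversions + blank_row_from_bottom) % 2 == 1


def random_solvable_puzzle(width, rng=random):
    """Shuffle until the configuration passes the solvability test."""
    tiles = list(range(width * width))
    while True:
        rng.shuffle(tiles)
        if is_solvable(tiles, width):
            return tiles


if __name__ == "__main__":
    print(random_solvable_puzzle(4))
```

Since half of all random configurations are solvable, the loop needs two shuffles on average, which fits the remark that a few iterations may be required.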
Read More →
Qlik Sense is a tool for exploratory data analysis and visualisation. It’s powerful and versatile. It can, however, be significantly enhanced by interfacing with R.
Read More →
Arrived in Paris rather late after catching the Eurostar from London. Trip nearly started on a bad note when I underestimated the time required to check in and get through passport control and security. Sat down on the train literally as it departed.
Early start, working on my tutorial for satRday. When the Sun came up I went out for a trot, primarily to get acquainted with the neighbourhood but also to locate the grave of Jim Morrison. Arrived at Père Lachaise Cemetery to find that it only opened at 08:00. Mildly disappointed. The breakfast that I had back at the hotel made up for that though.
Read More →
I need to deploy a Plumber API in a Docker container. The API has some R package dependencies which need to be baked into the Docker image. A few options for the base image:
The first option, r-base, would require building the dependencies from source, a somewhat time-consuming operation. The last option, r-apt, makes it possible to install most packages using apt, which is likely to be much quicker. I’ll immediately eliminate the other option, tidyverse, because although it already contains a load of packages, many of those are not required and, in addition, it incorporates RStudio Server, which is definitely not necessary for this project.
Rserve is a server which allows other programs to use the facilities of R via TCP/IP.