Blog Posts by Andrew B. Collier / @datawookie


The Price of Fuel: How Bad Could It Get?

The cost of fuel in South Africa (and I imagine pretty much everywhere else) is a contentious topic. It varies from month to month and, although it is clearly related to the price of crude oil and the exchange rate, various other forces play an influential role.

Read More →

Encyclopaedia: SANAE IV

A contribution which I wrote for Antarctica and the Arctic Circle: A Geographic Encyclopedia of the Earth’s Polar Regions. South Africa is one of the founding signatories of the Antarctic Treaty of 1959. In 1960, the first South African National Antarctic Expedition (SANAE) team overwintered at the Norwegian base on the Fimbul Ice Shelf. A new base, SANAE I, was constructed nearby (70° 18′S 2° 22′W) and opened in 1962. Later bases, SANAE II and SANAE III, were built on the same location (72° 40′ 22″S 2° 50′ 26″W) and commissioned in 1971 and 1979 respectively.

Read More →

Dealing with a Byte Order Mark (BOM)

I have just been trying to import some data into R. The data were exported from a SQL Server client in tab-separated value (TSV) format. However, reading the data into R the “usual” way produced unexpected results:

Read More →

Graph Databases

The book Graph Databases by Ian Robinson, Jim Webber and Emil Eifrem gives an engaging overview of Graph Databases, describing typical use cases and illustrating the syntax used to construct and query them. Graph Databases are a form of NoSQL database and, as such, differ significantly from the ubiquitous Relational Databases. The authors discuss a variety of scenarios where a Graph Database would be a better fit than a Relational Database, showing how they are particularly well suited to data which describe relationships between entities.

Read More →

R for Business Analytics

The book R for Business Analytics by Ajay Ohri sets out to look at “some of the most common tasks performed by business analysts and helps the user navigate the wealth of information in R and its 4000 packages.” In my opinion it succeeds in covering an extensive range of topics but fails to provide anything of substantial use to its intended audience. At least, not anything that could not be uncovered by a brief internet search.

Read More →

Simulating Intricate Branching Patterns with DLA

Manfred Schroeder’s book Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise is a fruitful source of interesting topics and projects. He gives a thorough description of Diffusion-Limited Aggregation (DLA) as a technique for simulating physical processes which produce intricate branching structures. Examples, as illustrated below, include Lichtenberg Figures, dielectric breakdown, electrodeposition and Hele-Shaw flow.

Read More →

Creating More Effective Graphs

A few years ago I ordered a copy of the 2005 edition of Creating More Effective Graphs by Naomi Robbins. Somewhat shamefully I admit that the book got buried beneath a deluge of papers and other books and never received the attention it was due. Having recently discovered the R Graph Catalog, which implements many of the plots from the book using ggplot2, I had to dig it out and give it some serious attention.

Read More →

Standard Bank: Striving for Mediocrity

Recently I was in my local Standard Bank branch. After finally reaching the front of the queue and being helped by a reasonably courteous young man, I was asked if I would mind filling out a survey. Sure. No problem. I had been in the bank for 30 minutes, I could probably afford another 30 seconds.

Read More →

Plotting Flows with {riverplot}

I have been looking for an intuitive way to plot flows or connections between states in a process. An obvious choice is a Sankey Plot, but I could not find a satisfactory implementation in R… until I read the post by January Weiner. His {riverplot} package does precisely what I need.

Read More →

Comrades Marathon: A Race for Geriatrics?

It has been suggested that the average Comrades Marathon runner is gradually getting older. As an “average runner” myself, I will not deny that I am personally getting older. But what I really mean is that the average age of all runners taking part in this great event is gradually increasing. This is not just an idle hypothesis: it is supported by the data. If you’re interested in the technical details of the analysis, these are included at the end; otherwise, read on for the results.

Read More →

Where to Put EAs and Indicators in New MT4 Builds

If you are creating an EA or indicator from scratch, then the MetaTrader editor places the files in the correct location and the terminal is automatically able to find them. However, if the files originate from a third party then you will need to know where to insert them so that they show up in the terminal. For older builds of MetaTrader 4 the directory structure was fairly simple.

Read More →