It is a bit of a mission to get the complete data set for this year’s Comrades Marathon. The full results are easily accessible, but come as an HTML file. Embedded in this file are links to the splits for individual athletes. With a bit of scripting wizardry it is also possible to download the HTML files for each of the individual athletes. Parsing all of these yields the complete result set, which is the starting point for this analysis.
A balanced experimental design is one in which the distribution of the covariates is the same in both the control and treatment groups. Although this is often achievable in an experiment, it is seldom achieved with observational data.
A package was recently released to generate plots in the style of xkcd using R. Being a big fan of the cartoon, I could not resist trying it out. I set out to produce something like one of Hans Rosling’s bubble plots.
I’ve just finished coding a swing alert indicator for a client. The rules are rather straightforward and it all depends on two simple moving averages (by default with periods of 25 and 5).
I am going to be using the party package for one of my projects, so I spent some time today familiarising myself with it. The details of the package are described in Hothorn, T., Hornik, K., & Zeileis, A. (1999). “party: A Laboratory for Recursive Partytioning” which is available from CRAN.
In the previous installment we generated some simple descriptive statistics for the National Health and Nutrition Examination Survey data. Now we are going to move on to an area in which R really excels: making plots and visualisations.
In the previous installment we derived two categorical variables. This time we will extract descriptive statistics.
In the previous installment we sucked some data from the National Health and Nutrition Examination Survey into R and did some preliminary work. Now we are going to play with some categorical data.
A year or so ago I went to a talk which included a diagram showing the locations of Earth’s fleet of geosynchronous satellites. According to the speaker, the information in this diagram was already quite dated: the satellites and their locations had changed.