# Blog Posts by Andrew B. Collier / @datawookie

## Optimising a Noisy Objective Function

I am busy with a project where I need to calibrate the Heston Model to some Asian options data. The model has been implemented as a function which executes a Monte Carlo (MC) simulation. As a result, the objective function is rather noisy. There are a number of algorithms for dealing with this sort of problem, and here I simply give a brief overview of some of them.

## Compiling Indicators and Expert Advisors

When you receive the code for an expert advisor or indicator which we have developed for you, it will come in a package consisting of include files (with a .mqh extension) and source code files (with a .mq4 extension). So, what do you do with them?

## Are Green Number Runners More Likely to Bail?

Comrades Marathon runners are awarded a permanent green race number once they have completed 10 journeys between Durban and Pietermaritzburg. For many runners, once they have completed the race a few times, achieving a green number becomes a possibility. And once the idea takes hold, it can become something of a compulsion. I can testify to this: I am thoroughly compelled! For runners with this goal in mind, every finish is one step closer to a green number. Read More →

## The Green Number Effect

Following up on a suggestion from my previous post, here are the statistics for medal count versus age. Every point on the plot is the number (see colour legend on right) of athletes who have achieved a given number of medals by a particular age. There is clear evidence of a Green Number Effect: many people hang on for ten medals and then pack it in. There is also weaker evidence of a Double Green Number Effect. Read More →

## Age Distribution of Comrades Marathon Athletes

I can clearly remember watching the end of the 1989 Comrades Marathon on television and seeing Wally Hayward coming in just before the final gun, completing the epic race at the age of 80! I was in awe.

Since I have been delving into the Comrades Marathon data, this got me thinking about the typical age distribution of athletes taking part. The plot below indicates the ages of athletes who finished the race, going all the way back to 1984. You can clearly spot the two years when Wally Hayward ran (1988 and 1989). My data indicates that he was only 79 on the day of the 1989 Comrades Marathon, but I am not going to quibble over a year and I am more than happy to accept that he was 80!

## Medal Allocations at the Comrades Marathon

It is a bit of a mission to get the complete data set for this year’s Comrades Marathon. The full results are easily accessible, but come as an HTML file. Embedded in this file are links to the splits for individual athletes. So with a bit of scripting wizardry it is also possible to download the HTML files for each of the individual athletes. Parsing all of these yields the complete result set, which is the starting point for this analysis.

## Analysis of Cable Morning Trade Strategy

A couple of years ago I implemented an automated trading algorithm for a strategy called the “Cable Morning Trade”. The basis of the strategy is the range of GBPUSD during the interval 05:00 to 09:00 London time. Two buy stop orders are placed 5 points above the highest high for this period; two sell stop orders are placed 5 points below the lowest low. All orders have a protective stop at 40 points. Read More →

## Balanced Data with {MatchIt}

A balanced experimental design is one in which the distribution of the covariates is the same in both the control and treatment groups. However, although often achievable in an experiment, for observational data this ideal is seldom achieved.

## xkcd Style Bubble Plot

A package was recently released to generate plots in the style of xkcd using R. Being a big fan of the cartoon, I could not resist trying it out. So I set out to produce something like one of Hans Rosling’s bubble plots.

I’ve just finished coding a swing alert indicator for a client. The rules are rather straightforward and it all depends on two simple moving averages (by default with periods of 25 and 5).

## Package {party}: Conditional Inference Trees

I am going to be using the party package for one of my projects, so I spent some time today familiarising myself with it. The details of the package are described in Hothorn, T., Hornik, K., & Zeileis, A. (1999). “party: A Laboratory for Recursive Partytioning” which is available from CRAN.

## Plotting categorical variables

In the previous installment we generated a few plots using numerical data straight out of the National Health and Nutrition Examination Survey. This time we are going to incorporate some of the categorical variables into the plots. Although going from raw numerical data to categorical data bins (like we did for age and BMI) does give you less precision, it can make drawing conclusions from plots a lot easier. We will start off with a simple plot of two numerical variables: age against BMI. Read More →

## Plotting numerical variables

In the previous installment we generated some simple descriptive statistics for the National Health and Nutrition Examination Survey data. Now we are going to move on to an area in which R really excels: making plots and visualisations.

## Descriptive Statistics

In the previous installment we derived two categorical variables. This time we will extract descriptive statistics.

## Categorical Variables

In the previous installment we sucked some data from the National Health and Nutrition Examination Survey into R and did some preliminary work. Now we are going to play the some categorical data.