Blog Posts by Andrew B. Collier / @datawookie


PLOS Subject Keywords: Association Rules

In a previous post I detailed the process of compiling data on subject keywords used in articles published in PLOS journals. In this instalment I’ll be using those data to mine Association Rules with the arules package. Good references on the topic of Association Rules are Section 14.2 of The Elements of Statistical Learning (2009) by Hastie, Tibshirani and Friedman; and Introduction to arules by Hahsler, Grün, Hornik and Buchta. Read More →

ubeR: A Package for the Uber API

Uber exposes an extensive API for interacting with their service. ubeR is an R package for working with that API, which Arthur Wu and I put together during a Hackathon at iXperience.

Installation: The package is currently hosted on GitHub. Installation is simple using the devtools package.

    devtools::install_github("DataWookie/ubeR")
    library(ubeR)

Authentication: To work with the API you’ll need to create a new application for the Rides API. Set Redirect URL to http://localhost:1410/. Read More →

Garmin ANT on Ubuntu

I finally got tired of booting up Windows to download data from my Garmin 910XT. I tried to get my old Ubuntu 15.04 system to recognise my ANT stick but failed. Now that I have a stable Ubuntu 16.04 system the time seems ripe.

openant: Install openant, a Python library for downloading and uploading files from ANT-FS compliant devices. Download the zip file from https://github.com/Tigge/openant. Unpack the archive and install using sudo python setup. Read More →

Sportsbook Betting (Part 2): Bookmakers’ Odds

In the first instalment of this series we gained an understanding of the various types of odds used in Sportsbook betting and the link between those odds and implied probabilities. We noted that the implied probabilities for all possible outcomes in an event may sum to more than 100%. At first sight this seems a bit odd. It certainly appears to violate the basic principles of statistics. However, this anomaly is the mechanism by which bookmakers assure their profits. Read More →
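
As a rough illustration of implied probabilities summing to more than 100%, here is a minimal Python sketch using made-up decimal odds for a two-outcome event. The odds, the outcome names and the helper `implied_probability` are hypothetical and not taken from the series. The excess over 100% is commonly called the overround.

```python
def implied_probability(decimal_odds: float) -> float:
    """Implied probability of an outcome quoted at the given decimal odds."""
    return 1.0 / decimal_odds


# Hypothetical decimal odds for a two-player tennis match (made-up numbers).
odds = {"Player A": 1.80, "Player B": 2.00}

# Convert each quoted price into an implied probability.
implied = {name: implied_probability(price) for name, price in odds.items()}

# With fair odds the implied probabilities would sum to exactly 1.
total = sum(implied.values())   # roughly 1.056 for these odds
overround = total - 1.0         # the excess is the bookmaker's margin

for name, p in implied.items():
    print(f"{name}: {p:.1%}")
print(f"Sum of implied probabilities: {total:.1%}")
print(f"Overround: {overround:.1%}")
```

With these numbers the implied probabilities add up to a little under 106%, so whichever player wins, the payouts are priced slightly below what fair odds would give.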

feedeR: Reading RSS and Atom Feeds from R

I’m working on a project in which I need to systematically parse a number of RSS and Atom feeds from within R. I was somewhat surprised to find that no package currently exists on CRAN to handle this task. So this presented the opportunity for a bit of DIY.

You can find the fruits of my morning’s labour here.

Read More →

Sportsbook Betting (Part 1): Odds

This series of articles was written as support material for Statistics exercises in a course that I’m teaching for iXperience. In the series I’ll be using illustrative examples for wagering on a variety of Sportsbook events including Horse Racing, Rugby and Tennis. The same principles can be applied across essentially all betting markets.

Read More →