I’ve taken another look at the {hagr} data, which I wrote about previously. This time I’m focusing on the hierarchy of creatures.
Taxonomic Rank
The Linnaean taxonomy is a hierarchical classification system for organisms devised by Carl Linnaeus. An organism is assigned to the following levels in the hierarchy (in increasing order of granularity):
domain
kingdom
phylum
class
order
family
genus
species.
The relative level of a group of organisms in this hierarchy determines its taxonomic rank.
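Since the ranks have a strict order, an ordered factor is a natural way to represent them in R. A minimal sketch (the rank labels come from the list above; the sample values are just for illustration):

```r
# Ranks from broadest to most specific.
RANKS <- c("domain", "kingdom", "phylum", "class", "order",
           "family", "genus", "species")

# An ordered factor makes rank comparisons meaningful.
rank <- factor(
  c("species", "genus", "family"),
  levels = RANKS,
  ordered = TRUE
)

# Which entries are at least as specific as genus?
rank >= "genus"
# [1]  TRUE  TRUE FALSE
```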
I came across the Human Ageing Genomic Resources. They are doing some fascinating work and expose some engrossing data. I wanted to make the data easier for me to work with, and an R package seemed to be the natural vehicle to do this.
For more information on these data, take a look at this article: Tacutu, Craig, Budovsky, Wuttke, Lehmann, Taranukha, Costa, Fraifeld and de Magalhaes, “Human Ageing Genomic Resources: Integrated databases and tools for the biology and genetics of ageing,” Nucleic Acids Research 41(D1):D1027-D1033, 2013.
Has the price of Easter Eggs shot up since last year? Let’s use data from Trundler to investigate. I’ll do the analysis in R using the {trundler} package.
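Roughly what I have in mind is something like the sketch below. The function and column names (set_api_key(), products(), product_prices(), product_id, price, time) are assumptions about the {trundler} interface rather than gospel, and the API key is a placeholder; check the package documentation for the exact calls.

```r
library(trundler)
library(dplyr)
library(lubridate)

# NOTE: function and column names below are assumptions about {trundler}.
set_api_key("YOUR-API-KEY")                     # Placeholder key.

# Find products matching "easter egg" (hypothetical search argument).
eggs <- products(product = "easter egg")

# Retrieve price histories and compare median prices by year.
product_prices(eggs$product_id) %>%
  mutate(year = year(time)) %>%
  group_by(year) %>%
  summarise(median_price = median(price, na.rm = TRUE), .groups = "drop")
```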
The Google Mobility Data (or Community Mobility Reports) are datasets published by Google which track how people move and congregate across various categories of places over time. The data are based on anonymised location information from users who have opted into Location History on their Google accounts.
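As a sketch of getting these data into R: Google published the global report as a single (large) CSV. The URL and column names below are as I recall them and worth verifying before relying on them.

```r
library(readr)
library(dplyr)

# Global Community Mobility Report CSV (large file; verify the URL is still live).
url <- "https://www.gstatic.com/covid19/mobility/Global_Mobility_Report.csv"

mobility <- read_csv(url)

# National-level records for South Africa (sub-regions are blank at this level).
mobility %>%
  filter(country_region == "South Africa", is.na(sub_region_1)) %>%
  select(
    date,
    retail_and_recreation_percent_change_from_baseline,
    residential_percent_change_from_baseline
  )
```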
Fathom Data is working on a project to reproduce the figures from the CORE textbook The Economy using R and {ggplot2}. There’s a strict style guide which specifies the figure aesthetics, including colours and font. We’re a team of seven people working on as many different setups. The principal challenges have been package versions and fonts.
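One way to tame the font problem across setups is to load the font from a common source and bake it into a shared theme. A minimal sketch using {showtext}; the font and colour here are placeholders, not the actual CORE style guide values.

```r
library(ggplot2)
library(showtext)

# Fetch a font from Google Fonts so every machine renders the same face.
# "Lato" is a placeholder, not necessarily the CORE font.
font_add_google("Lato", family = "lato")
showtext_auto()

# A shared theme applied to every figure.
theme_core <- theme_minimal(base_family = "lato") +
  theme(
    panel.grid.minor = element_blank(),
    plot.title = element_text(face = "bold")
  )
theme_set(theme_core)

ggplot(economics, aes(date, unemploy)) +
  geom_line(colour = "#0073cf") +              # Placeholder colour.
  labs(title = "Unemployment", x = NULL, y = "Unemployed (thousands)")
```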
I’ve been following an excellent tutorial for deploying a Docker image on an EC2 instance via GitLab CI/CD. It covers every step in the process in great detail. If you follow the steps then you’ll definitely end up with a working pipeline.
However, I still wasn’t quite sure how to handle the environment variables and credentials that I wanted to bake into the image, and which varied between my local development environment and the final deployed image.
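One common pattern (not necessarily what the tutorial does) is to store the values as GitLab CI/CD variables and pass them into the build as build arguments, bearing in mind that build args persist in the image history, so this suits configuration better than real secrets. The variable name API_URL below is purely illustrative; the Dockerfile would declare a matching `ARG API_URL` (and optionally `ENV API_URL=$API_URL`).

```yaml
# .gitlab-ci.yml (sketch): inject a CI/CD variable at build time.
build:
  stage: build
  image: docker:latest
  services:
    - docker:dind
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    - docker build --build-arg API_URL=$API_URL -t $CI_REGISTRY_IMAGE .
    - docker push $CI_REGISTRY_IMAGE
```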
📢 An updated version of this post reflecting recent changes in GitLab Runner can be found here.
I’ve got a project which takes a long time to build. And I rebuild it regularly. I’ve been using the shared runners on GitLab. However, the limit on total build minutes has become a problem. I’m going to install GitLab Runner as a Docker service on an underutilised EC2 instance.
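For reference, this is roughly what running GitLab Runner as a Docker container looks like (paths and image tag as per the GitLab documentation; the register step is interactive and needs your project’s registration token):

```bash
# Start the runner as a long-lived container, persisting its config on the host.
docker run -d --name gitlab-runner --restart always \
  -v /srv/gitlab-runner/config:/etc/gitlab-runner \
  -v /var/run/docker.sock:/var/run/docker.sock \
  gitlab/gitlab-runner:latest

# Register the runner with your GitLab instance (run once, follow the prompts).
docker run --rm -it \
  -v /srv/gitlab-runner/config:/etc/gitlab-runner \
  gitlab/gitlab-runner register
```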
I’ve been hosting a MySQL database on a DigitalOcean server for a few years. The project has been on hold for a while. Entropy kicked in and the server became unreachable. Fortunately I was still able to access the server via a recovery console to export the database using mysqldump and download the resulting SQL dump file.
Now I want to resurrect the database locally but I also want to migrate it to PostgreSQL.
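One route I’m considering (a sketch, not a finished recipe): restore the dump into a throwaway MySQL container, then let pgloader handle the conversion to PostgreSQL. The database name, password and dump file name below are placeholders.

```bash
# Throwaway MySQL instance to receive the dump.
docker run -d --name mysql-tmp \
  -e MYSQL_ROOT_PASSWORD=secret -e MYSQL_DATABASE=mydb \
  -p 3306:3306 mysql:8

# Load the SQL dump exported with mysqldump (wait for MySQL to finish initialising).
mysql -h 127.0.0.1 -P 3306 -u root -psecret mydb < dump.sql

# Migrate schema and data across to a local PostgreSQL database.
createdb mydb
pgloader mysql://root:secret@127.0.0.1/mydb postgresql:///mydb
```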
I started using rain (a South African ISP) back in March 2019. The coverage was good (the only place I couldn’t get a signal was at Lanseria Airport) and the bandwidth was consistently high. I loved the fact that it was affordable, reliable and portable.
I have a project where I need to have a persistent Selenium session. There’s a script which will leave a browser window open when it exits. When the script runs again it should connect to the same window.
I’ve been adding topographic maps of South Africa to the saffer-data-map repository. The maps are originally in MrSID format (.sid files), a proprietary file format developed by LizardTech (a company which has since been consumed by Extensis).
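To get these into a friendlier format I’ll lean on GDAL, which can read MrSID provided the build includes the (proprietary) MrSID driver. A sketch, with file names as placeholders:

```bash
# Check that this GDAL build can read the file (requires the MrSID driver).
gdalinfo map-sheet.sid

# Convert to GeoTIFF.
gdal_translate -of GTiff map-sheet.sid map-sheet.tif
```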