Gatsby running out of heap space

One day your Gatsby site is building fine and the next it’s breaking with a JavaScript heap out of memory
error. What’s gone wrong and how can you fix it?
At present there are two viable formats for the Transparency in Coverage data: JSON and XML. In this post we’ll dig into what the JSON files look like.
I have a challenge: extracting data from an enormous JSON file. The structure of the file is not ideal: it’s a mapping at the top level, which means that for most standard approaches the entire document needs to be loaded before it can be processed. It would have been so much easier if the top-level structure were an array. But, alas. It’s almost as if the purveyors of the data have made it intentionally inaccessible.
The Transparency in Coverage Act (bill currently before Congress) is a set of regulations that aim to increase transparency in health insurance coverage in the USA. The primary goal of the act is to provide consumers with clear, accessible, and actionable information about the cover that they receive from their health insurance. What services are included? How much will the insurer pay for a specific service? And how does this change from one provider to another? Or from one geographic region to another? Answers to these kinds of questions were previously hard, if not impossible, for a consumer to access.
In principle the information covered by the regulations should include costs, benefits, and other essential details. It should ensure that consumers can make informed healthcare decisions and understand the financial implications of their choices.
Code that moves data to and from S3 can slow down testing. A lot. This post demonstrates how you can speed things up by mocking S3.
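As a rough illustration of the idea (not necessarily the approach taken in the post), the sketch below replaces the S3 client with a MagicMock from the standard library, so the test never touches the network. The upload_report function, the bucket name and the key are hypothetical; put_object is the boto3 S3 client method that real code would call.

```python
from unittest.mock import MagicMock


def upload_report(s3_client, bucket, key, body):
    """Hypothetical code under test: write a report to S3."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return f"s3://{bucket}/{key}"


def test_upload_report():
    # Stand-in for a boto3 S3 client; no network calls are made.
    s3_client = MagicMock()

    uri = upload_report(s3_client, "example-bucket", "reports/2023.csv", b"a,b\n1,2\n")

    assert uri == "s3://example-bucket/reports/2023.csv"
    # Check that the code asked S3 to store exactly what we expected.
    s3_client.put_object.assert_called_once_with(
        Bucket="example-bucket", Key="reports/2023.csv", Body=b"a,b\n1,2\n"
    )
```

Passing the client in as an argument is what makes the swap easy: production code hands over a real client, tests hand over the mock.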
Alembic can autogenerate migrations. This is probably its most valuable feature. However, I had a situation where --autogenerate kept on creating migrations for the databasechangelog and databasechangeloglock tables. These are Liquibase tables and should never feature in the Alembic migrations.
The solution was to tell Alembic to ignore these tables by updating the env.py module.
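One way to do this, sketched here under the assumption that the include_object hook is used (the post may have chosen a different mechanism, such as include_name), is to add a filter function to env.py and hand it to context.configure(). The IGNORED_TABLES name is made up for the example.

```python
# Tables owned by Liquibase that Alembic should leave alone.
# (IGNORED_TABLES is an illustrative name, not part of Alembic.)
IGNORED_TABLES = {"databasechangelog", "databasechangeloglock"}


def include_object(object, name, type_, reflected, compare_to):
    """Return False for objects that autogenerate should skip."""
    if type_ == "table" and name in IGNORED_TABLES:
        return False
    return True


# In env.py the hook is passed to context.configure(), for example:
#
#     context.configure(
#         connection=connection,
#         target_metadata=target_metadata,
#         include_object=include_object,
#     )
```

With the hook in place, autogenerate should no longer propose changes to the two Liquibase tables, while every other table is compared as before.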
I need a list of medical conditions for a project. There are many potential sources for such a list. I selected the list published by NHS inform.
Marshmallow can readily handle nested schemas. But sometimes it’s preferable to flatten that schema for loading and/or dumping the data. The fields.Pluck() class makes this possible.
In a previous post I documented the process of setting up a GitLab Runner using the gitlab/gitlab-runner Docker image. As of GitLab Runner v16.0.0 the registration process has changed somewhat. This is an update to reflect that change.
Many of my projects now involve building a Docker image. The image is generally pushed to a registry as part of a CI workflow. This is how I push an image to Docker Hub from GitLab CI.
The data in the table below gives (manufacturer) specifications for a selection of kayaks and canoes. The data was originally compiled from two sources:
The data has been revised and expanded to include other manufacturers and more recent models. It has also been cleaned to some extent, but there is still work to be done. Please let me know if you spot any errors or omissions.
I prefer to have my primary key columns first in a table. I recognise that column order is irrelevant to the performance of the table, but I prefer this for personal aesthetic reasons. However, from SQLAlchemy 2.0.0 there’s a change in the way that column order works with inherited base classes.
Do you ever contribute to a Git repository from different machines? Yeah, you probably do. Sometimes you’re on your work machine. Other times you’re on your personal laptop. Or your gaming desktop. And you might have a different Git identity on each of those. And this means that your Git log ends up looking a bit messy. Who are all of these people with similar names but different email addresses? A .mailmap file can be used to tidy things up.
Getting Gatsby (also GatsbyJS) installed and running can be a challenge. With older versions of Ubuntu I have fought extensively with Node package versions. Docker seems to be a natural solution. This post shows how to build and run a simple Docker image for serving a development Gatsby site.
If you use BASH, then you’re probably already using the command history. BASH history allows you to access a list of previous commands executed in the shell. It can make you more productive and efficient: do more and do it quicker.
The default configuration of BASH history will suit most purposes. But, like most things in the Linux universe, it’s possible to tweak that configuration to suit your specific requirements. In this post I’ll present some of those configuration options.
Do you do any web scraping? If so, then you probably spend a lot of time scratching around in your browser’s Developer Tools, figuring out the DOM structure and understanding how various bits of a site are delivered. Wouldn’t it be cool to access the Developer Tools functionality from inside your scraper? Well, you can. The Chrome DevTools Protocol (CDP) provides a low-level interface for interacting with Chrome. And you can tap into that interface via Selenium.
Message IDs (MIDs) and Content IDs (CIDs) are used to identify and refer to email messages and specific pieces of content within those messages. They are formally described in RFC 2392.
The {emayili} package already has adapters which make it simple to send email via a variety of services. I have just added an adapter for the ZeptoMail transactional email service. The adapter is available in {emayili} version 0.7.13.
I recently upgraded to Ubuntu 22.10. Everything went very smoothly, but there are some minor teething problems. One of them is getting the AWS Workspace client to work. This is how I fixed that.
There’s one major problem with ChromeDriver: anti-bot services are able to detect that a browser session is being automated (as opposed to being used by a regular meat sack) and will often impose restrictions or deny connections altogether. The Undetected ChromeDriver (undetected-chromedriver) Python package is a patched version of ChromeDriver which avoids triggering a selection of anti-bot services, allowing it to glide under the anti-bot radar.