AWS Containers #5: Health Checks

Can we create a health check that will check if the Selenium service is available? Yes! We will need to do two things:

  • tell the crawler container to wait for the Selenium container to be HEALTHY and
  • add a health check to the Selenium container.

Let’s do it!

Adding a Health Check

  1. Change the Startup Dependency Ordering for the crawler container so that it waits for the Selenium container to be HEALTHY. Follow the procedure above 👆.
  2. Edit the task definition for the Selenium container.
  3. Scroll down to the Healthcheck section.
  4. Enter the following into the Command box (it will return a status of 1 until the service is answering requests on port 4444):
CMD-SHELL,curl -f http://localhost:4444/ || exit 1
  5. Specify 5 seconds for the interval and 2 seconds for the timeout. Leave the remaining fields in this section blank.
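If you prefer to keep the task definition in code rather than clicking through the console, the same settings can be written as container definitions. Here is a minimal sketch in Python that only builds the data structure. The field names (healthCheck, command, interval, timeout, dependsOn, containerName, condition) are the ones used in ECS task definitions, but the container names and image names are placeholders I have made up for illustration.

```python
import json


def build_container_definitions(selenium_name="selenium", crawler_name="crawler"):
    """Build two container definitions mirroring the console steps above.

    The container and image names are hypothetical placeholders.
    """
    selenium = {
        "name": selenium_name,
        "image": "example/selenium-image",  # placeholder, not from the article
        "healthCheck": {
            # Same command as entered in the Command box.
            "command": ["CMD-SHELL", "curl -f http://localhost:4444/ || exit 1"],
            "interval": 5,  # seconds between checks
            "timeout": 2,   # seconds before a single check is considered failed
        },
    }
    crawler = {
        "name": crawler_name,
        "image": "example/crawler-image",  # placeholder, not from the article
        # Do not start the crawler until the Selenium container reports HEALTHY.
        "dependsOn": [{"containerName": selenium_name, "condition": "HEALTHY"}],
    }
    return [selenium, crawler]


def as_json(definitions):
    """Render the definitions as pretty-printed JSON for inspection."""
    return json.dumps(definitions, indent=2)
```

Calling as_json(build_container_definitions()) gives you something you can compare against the JSON view of the task definition in the console.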

You can now start the newly revised task. ⏰ Wait for the task to finish and then check the logs.

Selenium Logs

These are the logs for the Selenium container:

2021-04-26 03:06:59,525 supervisord started with pid 8
2021-04-26 03:07:00,528 spawned: 'xvfb' with pid 20
2021-04-26 03:07:00,529 spawned: 'selenium-standalone' with pid 21
2021-04-26 03:07:01,617 success: xvfb entered RUNNING state
2021-04-26 03:07:01,617 success: selenium-standalone entered RUNNING state
03:07:02.237 Selenium server version: 3.141.59, revision: e82be7d358
03:07:02.820 Launching a standalone Selenium Server on port 4444
03:07:05.018 Initialising WebDriverServlet
03:07:06.423 Selenium Server is up and running on port 4444
03:07:23.728 Detected dialect: W3C
03:07:23.827 Started new session dd3f47d924cf9eed5f7255090ebc132a
Trapped SIGTERM/SIGINT/x so shutting down supervisord...
2021-04-26 03:07:27,917 received SIGTERM indicating exit request
2021-04-26 03:07:27,918 waiting for xvfb, selenium-standalone to die
2021-04-26 03:07:28,919 stopped: selenium-standalone (by SIGTERM)
2021-04-26 03:07:29,921 stopped: xvfb (by SIGTERM)
2021-04-26 03:07:30,123 Shutdown complete
The logs above have been abridged and edited for clarity.

Nothing has changed here.

Crawler Logs

And these are the logs for the crawler:

2021-04-26 03:07:16,736 ✅ Can communicate with localhost on port 4444!
2021-04-26 03:07:26,724 Retrieved URL: https://www.google.com/.

Aha! Now we see that the first attempt to communicate with the Selenium container happens at 03:07:16. By this stage the Selenium container is already “up and running on port 4444” (see log message at 03:07:06).

So, by adding a health check, we have ensured that the dependent container only starts once the process that it depends upon is available. Nice! 🚀

Health Check Details

Health Check Command

There are two ways to run the container health check:

  • CMD — to run the command directly; or
  • CMD-SHELL — to run the command using the container’s default shell.
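To get a feel for the difference, here is a rough local emulation in Python. It is only a sketch of the idea, not what actually runs inside the container: the function name run_health_check is my own, and it uses whatever shell your machine provides rather than the container's.

```python
import subprocess
import sys


def run_health_check(command):
    """Run a health check given as a list and return its exit code.

    The first element selects the mode, the rest is the command itself.
    This is a local emulation for experimenting, not the real mechanism.
    """
    kind, *args = command
    if kind == "CMD":
        # The arguments are executed directly, with no shell in between,
        # so shell syntax such as || or | is not available.
        return subprocess.run(args).returncode
    if kind == "CMD-SHELL":
        # The single string is handed to a shell, so || and pipes work.
        return subprocess.run(args[0], shell=True).returncode
    raise ValueError(f"unsupported health check type: {kind}")
```

For example, run_health_check(["CMD-SHELL", "false || exit 1"]) goes through the shell and so can use ||, whereas run_health_check(["CMD", sys.executable, "-c", "pass"]) runs the Python interpreter directly with the given arguments.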

Health Check Exit Code

The exit code is interpreted as follows:

  • 0 — success; or
  • any other value — failure.
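Since only the exit code matters, the check does not have to be curl. Here is a minimal sketch of a Python script that could play the same role, assuming Python is available inside the image (which may not be the case for yours). The function names are my own. Note that it only tests whether something accepts TCP connections on the port, which is a weaker test than curl -f, since curl also looks at the HTTP response.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, host unreachable and so on.
        return False


def exit_code_for(healthy):
    """Map a health result onto the exit code convention: 0 is success."""
    return 0 if healthy else 1


def main():
    # Port 4444 is the Selenium port used throughout this post.
    return exit_code_for(port_is_open("localhost", 4444))

# In a real script the last line would be: sys.exit(main())
# (with "import sys" at the top).
```

If this were saved as a script in the image (the path is up to you), the health check command would simply invoke it with the Python interpreter.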

Health Check Examples

Here are some other examples of health checks:

  1. Checking for a 401 (Unauthorised) status code. This would be used for a target which implements basic HTTP authentication.
CMD-SHELL,curl -s -o /dev/null -w "%{http_code}" http://localhost | grep "401" || exit 1