Scheduling Refresh of a Materialised View

Materialised views are a great alternative to views if the underlying query takes a long time to run. However, the principle problem with materialised views is that their content gets stale… and if the database is active, then it gets out of date rather quickly. Sure you can manually refresh a materialised view, but who has the discipline or time to do that? Better to automate the process. Then you can safely forget about it, secure in the knowledge that the data in the materialised view will remain current.

We’ll be addressing this issue specifically in the context of a PostgreSQL database running on Amazon RDS. We’ll be using the pg_cron extension, which has been supported by RDS for versions of PostgreSQL from 12.5.

Target Database for pg_cron Extension

By default pg_cron will target the database named "postgres". This is the default database name, so it’s quite likely that it’s the appropriate name. However, if you’ve given your database another name, then you’re going to have to tweak some settings to change the target database.

Create or edit the parameter group associated with the database. Change the value of the cron.database_name setting to the name of the target database.

Changing the target database for the pg_cron extension.

Update the settings for the database so that it uses the modified parameter group. Reboot the database so that these changes are applied.

When the database comes back up again you can check that the settings have been applied.

select name, setting from pg_settings where name like 'cron.%';
name                       |setting  |
---------------------------+---------+
cron.database_name         |prd      |
cron.host                  |localhost|
cron.log_run               |on       |
cron.log_statement         |on       |
cron.max_running_jobs      |5        |
cron.use_background_workers|on       |

Enabling pg_cron Extension

Enable the pg_cron extension.

create extension pg_cron;

That will set up the functions and tables required to support pg_cron.

Schedule an Event

Now you can add events to the pg_cron schedule by calling the cron.schedule() function.

select cron.schedule('0 * * * *', 'refresh materialized view price_ranked;')

The first argument sets the frequency of the event using the conventional cron syntax. The second argument is the query that should be executed on schedule.

You can also give the job a name by providing an optional first argument.

select cron.schedule('refresh ranked view', '0 * * * *', 'refresh materialized view price_ranked;')

You can check on scheduled jobs by querying the cron.job table.

select jobid, schedule, command, database, username, active, jobname from cron.job;
jobid|schule   |command                                |database|username|active|jobname            |
-----+---------+---------------------------------------+--------+--------+------+-------------------+
    1|0 * * * *|refresh materialized view price_ranked;|prd     |postgres|true  |refresh ranked view|

If you want to change the target database for a particular job then you can update the database field for that job.

List Scheduled Jobs

To generate a list of the jobs being managed by pg_cron:

select * from cron.job;

Deactivating Jobs

If you want to temporarily suspend a job then you can set its active field to false.

update cron.job set active = false where jobid = 6;

That job will not run again until you set its active field back to true.

update cron.job set active = true where jobid = 6;

Deleting Jobs

To delete a job call the cron.unschedule() function.

SELECT cron.unschedule(1);

Check the contents of the cron.job table and you should find that the specified job has been removed.

Job Logs

The cron.job_run_details table contains details of each of the job runs.

select jobid, runid, username, status, return_message, start_time, end_time from cron.job_run_details;
jobid|runid|username|status   |return_message             |start_time             |end_time               |
-----+-----+--------+---------+---------------------------+-----------------------+-----------------------+
    4|    6|postgres|succeeded|REFRESH MATERIALIZED VIEW  |2022-03-22 10:00:00.050|2022-03-22 10:10:30.730|
    4|    7|postgres|succeeded|REFRESH MATERIALIZED VIEW  |2022-03-22 11:00:00.060|2022-03-22 11:10:33.562|
    4|    8|postgres|succeeded|REFRESH MATERIALIZED VIEW  |2022-03-22 12:00:00.004|2022-03-22 12:10:38.183|
    4|    9|postgres|succeeded|REFRESH MATERIALIZED VIEW  |2022-03-22 13:00:00.077|2022-03-22 13:10:37.068|
    4|   10|postgres|succeeded|REFRESH MATERIALIZED VIEW  |2022-03-22 14:00:00.063|2022-03-22 14:10:31.191|
Number of active sessions versus time showing spikes due to refreshing materialised views.

Conclusion

Now walk away with the confidence that the materialised view will be regularly updated.

Of course, the pg_cron extension is very flexible and can be applied to many other scenarios beyond refreshing a materialised view.

Resources