PostgreSQL Job Scheduler: pg

In database management, job scheduling has always played an important role. And for PostgreSQL, PgAgent and pg_cron are examples of job schedulers that are already popular. However, there is another job scheduler called pg_timetable, which is completely database-driven and provides a couple of advanced concepts. In this blog, we are going to highlight some of the main features, installation, and some use cases of pg_timetable.

Main features of pg_timetable

Fully database-driven configuration
Cron-style scheduling at the PostgreSQL server time zone
Built-in tasks such as sending emails, etc.
Full support for database-driven logging
Tasks can be arranged in chains
A chain can consist of built-in commands, SQL, and executables
Parameters can be passed to chains/tasks

Installation

There are currently two options on how you can install and run pg_timetable.

Container installation. For this, please refer to this page.
Note: The image mentioned on the above page is based on Alpine and not from Percona.
Local installation, i.e., build from sources. Here, we will discuss this approach.

2.1. Download and install GO on your system.

2.2. Clone pg_timetable repo

 $ git clone https://github.com/cybertec-postgresql/pg_timetable.git
 $ cd pg_timetable

1 2	$ git clone https://github.com/cybertec-postgresql/pg_timetable.git $ cd pg_timetable

2.3. Run pg_timetable

$ go run main.go --dbname=dbname --clientname=worker001 --user=scheduler --password=strongpassword

1	$ go run main.go --dbname=dbname --clientname=worker001 --user=scheduler --password=strongpassword

2.4. Alternatively, build a binary and run it:

 $ go build
 $ ./pg_timetable --dbname=dbname --clientname=worker001 --user=scheduler --password=strongpassword

1 2	$ go build $ ./pg_timetable --dbname=dbname --clientname=worker001 --user=scheduler --password=strongpassword

Demonstration of job scheduling with pg_timetable

As a use case here, it will be shown how to use pg_timetable as a scheduler to schedule a job, which will refresh the materialized view every day at 12 midnight.

1. Download pg_timetable executable (Follow step 2 mentioned above in the installation section).

2. Make sure the PostgreSQL server is up and running and has a role with CREATE privilege for a target database, e.g.:

postgres=# CREATE ROLE scheduler PASSWORD '***********';
postgres=# GRANT CREATE ON DATABASE postgres TO scheduler;

postgres=# CREATE TABLE t_demo (grp int, data numeric);
CREATE TABLE
postgres=# INSERT INTO t_demo SELECT 1, random()
FROM generate_series(1, 5000000);
INSERT 0 5000000


postgres=#CREATE MATERIALIZED VIEW mat_view AS
SELECT grp, avg(data), count(*)
FROM t_demo
GROUP BY 1;
SELECT 1

postgres=# ALTER MATERIALIZED VIEW mat_view OWNER TO scheduler;
postgres=# GRANT SELECT ON mat_view TO scheduler;


-bash-4.2$ psql
psql (12.16)
Type "help" for help.


postgres=# SELECT * FROM mat_view;
grp | avg | count
-----+--------------------------+---------
1 | 0.5001807958659610956005 | 5000000
(1 row)

postgres=# INSERT INTO t_demo SELECT 2, random()
postgres-# FROM generate_series(1, 5000000);
INSERT 0 5000000

postgres=# CREATE ROLE scheduler PASSWORD '***********';

postgres=# GRANT CREATE ON DATABASE postgres TO scheduler;

postgres=# CREATE TABLE t_demo (grp int, data numeric);

CREATE TABLE

postgres=# INSERT INTO t_demo SELECT 1, random()

FROM generate_series(1, 5000000);

INSERT 0 5000000

postgres=#CREATE MATERIALIZED VIEW mat_view AS

SELECT grp, avg(data), count(*)

FROM t_demo

GROUP BY 1;

SELECT 1

postgres=# ALTER MATERIALIZED VIEW mat_view OWNER TO scheduler;

postgres=# GRANT SELECT ON mat_view TO scheduler;

-bash-4.2$ psql

psql (12.16)

Type "help" for help.

postgres=# SELECT * FROM mat_view;

grp | avg | count

-----+--------------------------+---------

1 | 0.5001807958659610956005 | 5000000

(1 row)

postgres=# INSERT INTO t_demo SELECT 2, random()

postgres-# FROM generate_series(1, 5000000);

INSERT 0 5000000

3. Create a new job to refresh the materialized view each night at 12:00 Postgres server time zone.

postgres=# SELECT timetable.add_job('refresh-matview', '0 12 * * *', 'REFRESH MATERIALIZED VIEW public.mat_view');
add_job
---------
1
(1 row)

postgres=# SELECT timetable.add_job('refresh-matview', '0 12 * * *', 'REFRESH MATERIALIZED VIEW public.mat_view');

add_job

---------

(1 row)

4. Run the pg_timetable

[centos@ip-172-31-32-10 pg_timetable]$ ./pg_timetable --dbname=postgres --clientname=worker001 --user=scheduler --password=********
2023-09-09 11:59:20.929 [INFO] [sid:697146069] Starting new session...
2023-09-09 11:59:20.941 [INFO] Database connection established
2023-09-10 12:00:00.961 [INFO] Accepting asynchronous chains execution requests...
2023-09-10 12:00:00.970 [INFO] [count:0] Retrieve scheduled chains to run @reboot
2023-09-10 12:00:00.991 [INFO] [count:3] Retrieve scheduled chains to run
2023-09-10 12:00:00.994 [INFO] [count:0] Retrieve interval chains to run
2023-09-10 12:00:00.019 [INFO] [chain:1] Starting chain
2023-09-10 12:00:00.722 [INFO] [chain:1] [task:1] [txid:2613] Starting task
2023-09-10 12:00:00.074 [INFO] [chain:1] [task:1] [txid:2613] Starting task
2023-09-10 12:00:00.141 [INFO] [chain:1] [task:1] [txid:2613] Closing remote session
2023-09-10 12:00:00.141 [INFO] [chain:1] [task:1] [txid:2613] Task executed successfully
2023-09-10 12:00:00.185 [INFO] [chain:1] [txid:2613] Chain executed successfully

[centos@ip-172-31-32-10 pg_timetable]$ ./pg_timetable --dbname=postgres --clientname=worker001 --user=scheduler --password=********

2023-09-09 11:59:20.929 [INFO] [sid:697146069] Starting new session...

2023-09-09 11:59:20.941 [INFO] Database connection established

2023-09-10 12:00:00.961 [INFO] Accepting asynchronous chains execution requests...

2023-09-10 12:00:00.970 [INFO] [count:0] Retrieve scheduled chains to run @reboot

2023-09-10 12:00:00.991 [INFO] [count:3] Retrieve scheduled chains to run

2023-09-10 12:00:00.994 [INFO] [count:0] Retrieve interval chains to run

2023-09-10 12:00:00.019 [INFO] [chain:1] Starting chain

2023-09-10 12:00:00.722 [INFO] [chain:1] [task:1] [txid:2613] Starting task

2023-09-10 12:00:00.074 [INFO] [chain:1] [task:1] [txid:2613] Starting task

2023-09-10 12:00:00.141 [INFO] [chain:1] [task:1] [txid:2613] Closing remote session

2023-09-10 12:00:00.141 [INFO] [chain:1] [task:1] [txid:2613] Task executed successfully

2023-09-10 12:00:00.185 [INFO] [chain:1] [txid:2613] Chain executed successfully

During the first start of pg_timetable, the necessary schema timetable gets created. For reference, below is the catalog structure.

postgres=# dn
List of schemas
Name | Owner
-----------+-----------
public | postgres
timetable | scheduler
(2 rows)

postgres=# set search_path to timetable ;
SET
postgres=# dt
List of relations
Schema | Name | Type | Owner
-----------+----------------+-------+-----------
timetable | active_chain | table | scheduler
timetable | active_session | table | scheduler
timetable | chain | table | scheduler
timetable | execution_log | table | scheduler
timetable | log | table | scheduler
timetable | migration | table | scheduler
timetable | parameter | table | scheduler
timetable | task | table | scheduler
(8 rows)

postgres=# dn

List of schemas

Name | Owner

-----------+-----------

public | postgres

timetable | scheduler

(2 rows)

postgres=# set search_path to timetable ;

SET

postgres=# dt

List of relations

Schema | Name | Type | Owner

-----------+----------------+-------+-----------

timetable | active_chain | table | scheduler

timetable | active_session | table | scheduler

timetable | chain | table | scheduler

timetable | execution_log | table | scheduler

timetable | log | table | scheduler

timetable | migration | table | scheduler

timetable | parameter | table | scheduler

timetable | task | table | scheduler

(8 rows)

5. From database logs, it can be observed that MATERIALIZED VIEW gets refreshed as per the schedule.

Output from DB logs:

2023-09-10 12:00:00 UTC [14334] LOG:  statement: REFRESH MATERIALIZED VIEW public.mat_view

1	2023-09-10 12:00:00 UTC [14334] LOG: statement: REFRESH MATERIALIZED VIEW public.mat_view

From psql prompt:

-bash-4.2$ psql
psql (12.16)
Type "help" for help.

postgres=# SELECT * FROM mat_view;
grp | avg | count
-----+---------------------------+---------
1 | 0.5001807958659610956005 | 5000000
2 | 0.50000009110202547559215 | 5000000
(2 rows)

-bash-4.2$ psql

psql (12.16)

Type "help" for help.

postgres=# SELECT * FROM mat_view;

grp | avg | count

-----+---------------------------+---------

1 | 0.5001807958659610956005 | 5000000

2 | 0.50000009110202547559215 | 5000000

(2 rows)

Below is the output from the pg_timetable catalog tables.

postgres=# select * from active_session ;
client_pid | server_pid | client_name | started_at
------------+------------+-------------+------------------------------
697146069 | 20137 | worker001 | 2023-09-10 11:59:205672+00
(1 row)

postgres=# select * from active_session ;

client_pid | server_pid | client_name | started_at

------------+------------+-------------+------------------------------

697146069 | 20137 | worker001 | 2023-09-10 11:59:205672+00

(1 row)

postgres=# select * from chain;
-[ RECORD 1 ]-------+--------------------
chain_id | 1
chain_name | refresh-matview
run_at | 0 12 * * *
max_instances |
timeout | 0
live | t
self_destruct | f
exclusive_execution | f
client_name |
on_error |

postgres=# select * from chain;

-[ RECORD 1 ]-------+--------------------

chain_id | 1

chain_name | refresh-matview

run_at | 0 12 * * *

max_instances |

timeout | 0

live | t

self_destruct | f

exclusive_execution | f

client_name |

on_error |

postgres=# select * from execution_log where chain_id=1;
-[ RECORD 01 ]-------------------------------------------------------------------
chain_id | 1
task_id | 1
txid | 2613
last_run | 2023-09-10 12:00:00.137404+00
finished | 2023-09-10 12:00:00.586543+00
pid | 697146069
returncode | 0
kind | SQL
command | REFRESH MATERIALIZED VIEW public.mat_view
output | REFRESH MATERIALIZED VIEW
client_name | worker001

postgres=# select * from execution_log where chain_id=1;

-[ RECORD 01 ]-------------------------------------------------------------------

chain_id | 1

task_id | 1

txid | 2613

last_run | 2023-09-10 12:00:00.137404+00

finished | 2023-09-10 12:00:00.586543+00

pid | 697146069

returncode | 0

kind | SQL

command | REFRESH MATERIALIZED VIEW public.mat_view

output | REFRESH MATERIALIZED VIEW

client_name | worker001

In conclusion, we can say that pg_timetable is open source and can be used freely by everyone. The main advantages include that pg_timetable is an independent process written in GO, which connects to PostgreSQL just like any other client program. So, if the scheduler crashes, it will not harm your server. Pg_timetable provides a variety of built-in tasks that help you to flexibly combine those operations in an easy way. Further, pg_timetable has been implemented in GO and, therefore, comes as ONE executable that can be started directly. So, there is no need to worry about libraries or dependencies during the installation.

Please refer to the links below to learn more about pg_timetable.

https://github.com/cybertec-postgresql/pg_timetable

https://www.cybertec-postgresql.com/en/products/pg_timetable/

https://pg-timetable.readthedocs.io/en/master/README.html

0 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

MySQL 5.7
End of Life

Compare Percona to Leading Database Solutions

Software
Downloads

Product
Documentation

Resource Hub

Why Percona for MongoDB?

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

PostgreSQL Job Scheduler: pg_timetable

Main features of pg_timetable

Installation

Demonstration of job scheduling with pg_timetable

Related

Related Blog Articles

RECOMMENDED ARTICLES

pgvector: The Critical PostgreSQL Component for Your Enterprise AI Strategy

Does FCV Have Any Impact on MongoDB Performance?

Behind the Scenes: How Percona Support Diagnosed a MongoDB FTDC Freeze

MOST POPULAR ARTICLES

Auditing Login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7 End of Life

Compare Percona to Leading Database Solutions

Software Downloads

Product Documentation

Resource Hub

Why Percona for MongoDB?

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

PostgreSQL Job Scheduler: pg_timetable

Main features of pg_timetable

Installation

Demonstration of job scheduling with pg_timetable

Related

Share This Post!

Want to get weekly updates listing the latest blog posts?

Related Blog Articles

RECOMMENDED ARTICLES

pgvector: The Critical PostgreSQL Component for Your Enterprise AI Strategy

Does FCV Have Any Impact on MongoDB Performance?

Behind the Scenes: How Percona Support Diagnosed a MongoDB FTDC Freeze

MOST POPULAR ARTICLES

Auditing Login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7
End of Life

Software
Downloads

Product
Documentation