This latest and greatest version of PostgreSQL, version 13, has been here since last summer and continues the tradition of gains in performance, administration, application development, and security enhancements.

Of the many new features and enhancements that have been made, some of the more notable ones include:

  • Space savings and performance gains from de-duplication of B-tree index entries
  • Improved performance for queries that use aggregates or partitioned tables
  • Better query planning when using extended statistics
  • Parallelized vacuuming of indexes
  • Incremental sorting

TIP: More detailed information can be found in the Release Notes here.

I learned long ago that it can be something of a challenge keeping up to date with all this new stuff. Therefore, I’d like to cover a small subset of these new advances by demonstrating simple use cases which I hope you will find constructive and are related to pgbench, logical replication, streaming replication fail-over, and re-provisioning a deprecated PRIMARY as a STANDBY using pg_rewind.

About pgbench

As you may recall, pgbench is a simple program for benchmark tests on PostgreSQL by running the same sequence of SQL commands over and over in multiple concurrent database sessions. Pgbench finds its most common use, as the name implies, as a benchmarking tool testing hardware and configuration runtime parameters for the purposes of performance tuning.

The latest iteration of pgbench has a number of new capabilities and includes:

  • A new set of options chooses data generation either from the client or the server side i.e. ‘g’, and ‘G’. Whereas previous versions generated data on the pgbench client and then sent it to the server, one now has the ability to test data generation exclusively on the server itself without being affected by network latency.
  • The ability to create a partitioned “accounts” table using either range or hash partitioning although the default is range partitioning.
  • A new option ‘–show-script’ which echoes the actual code of any built-in script name on stderr.

Example 1: Hash and Range Partitioning

The first step is to of course initialize the benchmarking, in this case, we start out using hash partitioning:

Here’s what it should look like:

And here’s the partitioned pgbench_accounts table definition:

 

Alternatively, we can just as easily partition using range partitioning:

For those people curious to see the actual commands creating the partition(s) just update the runtime parameter and look in your postgres log:

 

 

Example 2: Benchmarking

Part I

Let’s initialize a database with the following conditions:

  • install pg_stat_statements to see our results
  • initialize benchmarking
  • generate data solely on the server-side
  • create primary keys
  • create foreign key constraints
  • range partition table “accounts” creating five child tables

Open file postgresql.conf​ and edit the following runtime parameters. Restart the server and add extension pg_stat_statement to any database using the command create pg_stat_statement.

Initialize an empty database:

This query should return a nice summary of the commands thus far executed:

 

Part II

Execute the benchmarking for a duration of five minutes. Just to make things clearer, you can reset the stats in view pg_stat_statements.

New SQL statements representing the DML operations are now listed:

 

Notice how the child tables have grown in size:

About Logical Replication

We now explore another very useful addition to pg version 13 which is the transparent replication of partitioned tables via logical decoding. No longer is it necessary to expend valuable time manually adding the triggers and supplementary instruction replicating the child tables.

Example

Using the aforementioned pgbench example, execute the following. Note that one must update table history, by adding a primary key, otherwise, replication is not possible with this table:

And here we see the child accounts tables have been correctly replicated:

About Streaming Replication

Because of the large number of new features added to streaming replication, I’m limiting the focus on two enhancements:

  • Restarting a STANDBY in order to point it to a newly promoted PRIMARY, after updating primary_conninfo and primary_slot_name, is no longer necessary as a reload will suffice.
  • pg_rewind has a new option, –write-recovery-conf,  that simplifies the steps of reprovisioning a failed PRIMARY as a viable STANDBY, similarly to the one found in pg_basebackup.

Recall the three-node streaming replication cluster we’ve been using:

Example 1: STANDBY, pg3, Points to New PRIMARY pg2

Suppose a failover promotes pg2. This new feature adds robustness to the entire database cluster. REPLICA pg3 continues service without interruption as it is redirected from pg1 to pg2. Previous versions of postgres required host pg3 to be restarted in order to effect new primary_conninfo parameters.

Step 1

pg3-POC13; run a benchmark of SELECTS on host pg3

Step 2

pg2-POC13; promote the host while the bench-marking on pg3 is active

TIP: it’s understood that runtime parameter wal_keep_size is a non-zero value and is set sufficiently retaining WALs as pg3 is redirected away from pg1 towards pg2.

Step 3

pg3-POC13;  point to pg2-POC13

Step 4

pg2-POC13; validate replication from pg2->pg3

Example 2

When re-provisioning a failed PRIMARY, ie pg1, as a new STANDBY, it used to be necessary to edit the requisite recovery runtime configuration parameters after executing pg_rewind but before starting the host.

PostgreSQL version 13 now simplifies the exercise by providing the switch  –write-recovery-conf and updating postgresql.auto.conf with the correct runtime parameters.

TIP: pg_rewind requires runtime parameter wal_log_hints=on before re-provisioning takes place on the failed PRIMARY

Step 1

pg2-POC13; add a slot on PRIMARY pg2 for the new STANDBY pg1

Step 2

pg1-POC13;  re-provision pg1

You should get messaging similar to the following:

Now you can restart host pg1-POC13 and bring it back into service:

Login pg1 and confirm replication:

Caveat

In the case of failure, check the following:

  • pg2-POC13: As the PRIMARY you will want to check that you have a slot installed and ready before starting up the reprovisioned pg1 as a new standby
  • pg1-POC13:
    • check to see if he server started up i.e. ‘netstat -tlnp’
    • check postgresql.auto.conf if you get a FATAL server is starting up message
    • check the slot name
    • check connectivity info is correct

Conclusion

Although not germane to features and capabilities, I was a little curious to see the differences between PostgreSQL 13 and earlier versions. Here’s a table with metrics creating a data cluster, database, and time required initializing pgbench. It’s interesting to see how performance times have improved over the previous versions:

Frankly, I find it amazing how the size of both binaries and the data cluster has remained so compact over the years. I remember, way back in 2000, I was working with a proprietary RDBMS that upon initialization created an empty instance of 1GB in size – I wonder how big it gets these days??

Happy Hacking!

References:

PostgreSQL 13 Released!

PostgreSQL 13 Documentation

PostgreSQL Versioning Policy


Our white paper “Why Choose PostgreSQL?” looks at the features and benefits of PostgreSQL and presents some practical usage examples. We also examine how PostgreSQL can be useful for companies looking to migrate from Oracle.

Download PDF