PostgreSQL 12: Foreign Keys and Partitioned Tables
Now that PostgreSQL 12 is out, we consider foreign keys to be fully compatible with partitioned tables. You can have a partitioned table on either side of a foreign key constraint, and everything will work correctly.
Why do I point this out? Two reasons: first, when partitioned tables were first introduced in PostgreSQL 10, they didn’t support foreign keys at all; you couldn’t create FKs on partitioned tables, nor create FKs that referenced a partitioned table. Second, because the (early days) table inheritance feature didn’t really support foreign keys either. All this means that for the first time it is possible in PostgreSQL to maintain large volumes of data while maintaining referential integrity. Now that this feature is complete, some new use cases are open to PostgreSQL that previously weren’t.
Here’s a pretty trivial example.
CREATE TABLE items ( item_id integer PRIMARY KEY, description text NOT NULL ) PARTITION BY hash (item_id); CREATE TABLE items_0 PARTITION OF items FOR VALUES WITH (modulus 3, remainder 0); CREATE TABLE items_1 PARTITION OF items FOR VALUES WITH (modulus 3, remainder 1); CREATE TABLE items_2 PARTITION OF items FOR VALUES WITH (modulus 3, remainder 2); CREATE TABLE warehouses (warehouse_id integer primary key, location text not null); CREATE TABLE stock ( item_id integer not null REFERENCES items, warehouse_id integer not null REFERENCES warehouses, amount int not null ) partition by hash (warehouse_id); CREATE TABLE stock_0 PARTITION OF stock FOR VALUES WITH (modulus 5, remainder 0); CREATE TABLE stock_1 PARTITION OF stock FOR VALUES WITH (modulus 5, remainder 1); CREATE TABLE stock_2 PARTITION OF stock FOR VALUES WITH (modulus 5, remainder 2); CREATE TABLE stock_3 PARTITION OF stock FOR VALUES WITH (modulus 5, remainder 3); CREATE TABLE stock_4 PARTITION OF stock FOR VALUES WITH (modulus 5, remainder 4);
You can see there are two foreign keys here. One points to to a regular (not partitioned) table warehouses, the other points to partitioned table items. Did you notice that each foreign key is only declared once?
There are two basic operations that you want the foreign key to provide. First, if you insert a row in stock (the referencing table) that doesn’t have a corresponding row in items or warehouses (the referenced table), an error must be raised. Second, if you delete a row in either referenced table and there are matching rows in stock, that operation must be rejected too.
Both are easily verified:
INSERT INTO stock values (1, 1, 10); ERROR: insert or update on table "stock_0" violates foreign key constraint "stock_item_id_fkey" DETAIL: Key (item_id)=(1) is not present in table "items".
Good. Then you can insert matching rows in both referenced tables and a referencing row. After that, a delete in either of the referenced tables will fail, as expected.
INSERT INTO items VALUES (1, 'item 1'); INSERT INTO warehouses VALUES (1, 'The moon'); INSERT INTO stock VALUES (1, 1, 10); DELETE FROM warehouses; ERROR: update or delete on table "warehouses" violates foreign key constraint "stock_warehouse_id_fkey" on table "stock" DETAIL: Key (warehouse_id)=(1) is still referenced from table "stock". DELETE FROM items; ERROR: update or delete on table "items_2" violates foreign key constraint "stock_item_id_fkey3" on table "stock" DETAIL: Key (item_id)=(1) is still referenced from table "stock".
(Of course, an UPDATE operation is for the former operation the same as an INSERT, and for the latter operation the same as a DELETE — meaning that both the original tuple and the modified tuple must be checked, if the UPDATE modifies the columns involved in the foreign key.)
If these examples seem lame to experienced users, it’s because these things work exactly the same for regular (not partitioned) tables since time immemorial.
In real usage, you would need indexes in the referencing columns in the stock table, if you ever modify the referenced tables. This is because the server needs to locate those referencing rows in order to know to throw an error or such. You can do that with the partitioned referencing table easily enough:
CREATE INDEX ON stock (item_id); CREATE INDEX ON stock (warehouse_id);
In this post I have shown the basics of foreign keys, and how they can be used on partitioned tables just like they can on regular tables. In a subsequent post I’ll cover a couple of additional features of those. Let me know in a comment if you like this PostgreSQL 12 improvement!
In postgres 12, how can we reference a partitioned table where the referenced column is not the partitioned column. The issue we are facing is
– referenced column needs to be an unique constraint
– cannot create an unique constraint without including the partition key
– partition key/column not in the referring table (that would be too easy)
Potential solutions
– repartition using column that is referenced
– add partition key to referring table
Both solutions are not ideal…anything that I missed?
Thanks
Dave
When you say “reference” I take it you mean create a foreign key constraint that references … ?
If so, I don’t believe there is support for creating partition-wide
UNIQUE
indexes that don’t incorporate the partitioning key. Not yet, anyway, and I don’t know off the top of my head if anyone’s actively working on it.Create a FK that references a partitioned table. I researched this a bit and while its great that pg12 supports this it is not useful unless that partitioned key is the same as also the FK in all other tables which may not make sense.
For example, if I wanted to partition on date (range) or status (list) on a given table I could not unless we add those columns to the FK on the referencing table. The only partition column we could use is ID which would be hash and will need to be adjusted as they get larger.
Yes, this is a restriction on primary keys (or, more specifically, on UNIQUE keys). It’s not an issue with foreign keys themselves.
Lifting that limitation has been discussed a number of times in the development community, but it’s not a simple problem. I don’t know of anyone actively working on this problem at present. It will likely be tackled at some point in the future.
Users can influence the timing to get features done, by funding development. 2ndQuadrant is open to doing it.
Absolutely brutal this is not supported… What is the point of having a partitioned table if you can never reference it. 99% of people don’t want to partition on the arbitrary non-descriptive column id anyways. Is there an open ticket for this? You can’t necessarily use table inheritance style partitioning as the same problem will remain, the parent table won’t even have the key value so you cannot really add the unique constraint…
Introducing partitioned tables in PostgreSQL was not an easy task, and it took an enormous amount of effort from a lot of people across several different companies. We always knew we could never introduce the complete feature in a single release, so it was agreed that some things would not be supported from the get-go but instead added incrementally over time. Adding support for FKs was one of those things we left out of the initial partitioned table implementation, but now (in PostgreSQL 12) it is finally done.
We still need to complete the implementation of primary keys, so that you can have a PK that does not include the whole partition key.
I am not criticizing the amazing work that has been done and will be done with PG partitioning. I was merely trying to understand the current limits and state a case on why we need it fully implemented sooner vs later.
Until partitioning can be fully implemented, we have opted to use triggers and functions to check the partitioned table to maintain integrity. Sure there is a penalty but we only need to check for INSERTS and UPDATES (we do not delete) and our system is 99% read.