CYBERTEC Logo

Foreign data wrapper for PostgreSQL: Performance Tuning

03.2019 / Category: / Tags: |

Foreign data wrappers are one of the most widely used feature in PostgreSQL. People simply like foreign data wrappers and we can expect that the community will add even more features as we speak. As far as the postgres_fdw is concerned there are some hidden tuning options which are not widely known by users. So let's see how we can speed up the PostgreSQL foreign data wrapper.

Foreign data wrappers: Creating a “database link”

To show how things can be improved we first have to create some sample data in “adb”, which can then be integrated into some other database:

In this case I have simply loaded 100.000 rows into a very simple table. Let us now create the foreign data wrapper (or “database link” as Oracle people would call it). The first thing to do is to enable the postgres_fdw extension in “bdb”.

In the next step we have to create the “SERVER”, which points to the database containing our sample table. CREATE SERVER works like this:

Once the foreign server is created the users we need can be mapped:

In this example the user mapping is really easy. We simply want the current user to connect to the remote database as “hs” (which happens to be my superuser).

Finally, we can link the tables. The easiest way to do that is to use “IMPORT FOREIGN SCHEMA”, which simply fetches the remote data structure and turns everything into a foreign table.

The command is really easy and shown in the next listing:

As you can see PostgreSQL has nicely created the schema for us and we are basically ready to go.

Testing postgres_fdw performance

When we query our 100.000 row table we can see that the operation can be done in roughly 7.5 milliseconds:

Let us connect to “bdb” now and see, how long the other database needs to read the data:

In this example you can see that 90 milliseconds are burned to do the same thing. So why is that? Behind the scenes the foreign data wrapper creates a cursor and fetches data in really small chunks. By default, only 50 rows are fetched at a time. This translates to thousands of network requests. If our two database servers would be further away, things would take even longer – A LOT longer. Network latency plays a crucial role here and performance can really suffer.

One way to tackle the problem is to fetch larger chunks of data at once to reduce the impact of the network itself. ALTER SERVER will allow us to set the “fetch_size” to a large enough value to reduce network issues without increasing memory consumption too much. Here is how it works:

Let us run the test and see, what will happen:

PostgreSQL Foreign Data Wrapper performance

Wow, we have managed to more than double the speed of the query. Of course, the foreign data wrapper is still slower than a simple local query. However, the speedup is considerable and it definitely makes sense to toy around with the parameters to tune it.

If you want to learn more about Foreign Data Wrappers, performance and monitoring, check out one of our other blog posts.

 


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

0 0 votes
Article Rating
Subscribe
Notify of
guest
8 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Igor Pomaranskiy
Igor Pomaranskiy
4 years ago

Landed out on this page after googling for FWD chunk size. And that's it. 🙂

Thank you so much!

Hans-Jürgen Schönig
Hans-Jürgen Schönig
4 years ago

thank you for your feedback 🙂 share the news :). i hope your posts are useful 🙂

GLUGBU
GLUGBU
4 years ago

Hi Laurenz, this is great help for my team!
My question is does fetch size have a limit?
Oh and another one, I read somewhere that this fetch_size can be specified for server or foreign table. Should we do both for better performance?
Thank you!

laurenz
laurenz
4 years ago
Reply to  GLUGBU

The limit for fetch_size is how much RAM you have on the client side to buffer results. But then the size of network packets is limited, so you will not gain performance beyond a certain value.

It does not matter if you set it on the table or the server — setting it on the server only means that it is set on all tables that belong to that server (unless explicitly overridden on the table level).

Thiago Hoffmannbeck
Thiago Hoffmannbeck
5 years ago

Amazing tip! Use of big external tables joined with local tables is really slow... Do you have a better solution to access tables from other databases in same server?

laurenz
laurenz
5 years ago

That is unavoidable. To join tables that reside in different systems, you have to ship a semi-join from one to the other. If the tables are big, that is expensive.
I guess my advice is to stay away from architectures that require joining big tables on different servers frequently.

lalligood
lalligood
5 years ago

What a fantastic "hidden" option! Shame that it only works with postgres_fdw & not all FDWs.

laurenz
laurenz
5 years ago
Reply to  lalligood

It depends on the data source what is possible. For file_fdw, it would not make sense. In oracle_fdw, there is the prefetch option.

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    8
    0
    Would love your thoughts, please comment.x
    ()
    x
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram