Wednesday, April 24, 2024

PostgreSQL's memory allocations

There's a thread on hackers about recovering memory consumed by paths. A reference count is maintained in each path. Once paths are created for all the upper level relations that a given relation participates in, any unused paths, for which reference count is 0, are freed. This adds extra code and CPU cycles to traverse the paths, maintain reference counts and free the paths. Yet, the patch did not show any performance degradation. I was curious to know why. I ran a small experiment.

Experiment

I wrote an extension palloc_test which adds two SQL-callable functions palloc_pfree() and mem_context_free() written in C. Function definitions can be found here. The first function palloc's some memory and then pfree's it immediately. Other function just palloc's but never pfrees, assuming that the memory will be freed when the per-tuple memory context is freed. Both functions take the number of iterations and size of memory allocated in each iteration respectively as inputs. These functions return amount of time taken to execute the loop allocating memory. It appears that the first function spends CPU cycles to free memory and the second one doesn't. So the first one should be slower than the second one.

Results

The table below shows the amount of time reported by the respective functions to execute the loop as many times as the value in the first column, each iteration allocating 100 bytes. The figure shows the same as a plot. The time taken to finish the loop increases linearly for both the function indicating that the palloc logic is O(n) in terms of number of allocations. But the lines cross each other around 300K allocations.


 

countpalloc_pfreememory context reset
1000.00290.007124
1001002.56465.079862
2001005.168210.375552
3001007.637315.704286
40010010.182719.038238
50010012.701323.847599
60010015.283828.708501
70010017.825536.982928
80010020.371841.863638
90010023.070644.332727
100010051.331152.546201
200010056.7407104.747792
300010076.3961154.225157
4000100102.3415206.510045
5000100126.1954256.367685
6000100155.8812314.178951
7000100179.9267367.597501
8000100206.2112420.003351
9000100234.7584474.137076


Inference and conclusion

This agrees with the observations I posted on the thread. Instead of letting all the useless path to be freed when query finishes, freeing them periodically during planning is time efficient as well as memory efficient. It compensates for the extra CPU cycles spent to maintain reference counts, traverse and free paths.

The actual memory allocation and freeing pattern as implemented in that patch is different from that in the experiment, so it might be worth repeating those experiments by simulating similar pattern.

I used chunk size of 100 since I thought it's closer to the order of average path size. But it might be worth repeating the experiment with larger chunk sizes to generalize the result.

No comments:

Post a Comment