Usage of disk space in Oracle and PostgreSQL
A few days ago I built a table in Oracle (11, if that matters) to store a few hundred megabytes of data. But I don’t feel at home using Oracle, so I decided to export the data and import it back in PostgreSQL 12.Surprisingly, PostgreSQL requires more data space to store the same amount of data.
I’m not saying anything about who is the best, and I don’t know the exact reasons why this happens, however this is just what I’ve observed hpoing this can be useful to someone else!
So please don’t flame!
Table structure
The table is really simple, and holds data about files on a disk. It does not have even a key, since it is just data I must mangle and then throw away.testdb=> \d my_schema.my_files
Table "my_schema.my_files"
Column | Type | Collation | Nullable | Default
-----------|-------------------------|-----------|----------|---------
filename | character varying(200) | | |
directory | character varying(2048) | | |
md5sum | character varying(128) | | |
bytes | bigint | | |
text
against a varchar
, I used the latter just to be as similar as possible in the definition with Oracle.
The table is populated with
1872529
tuples (around 2 million tuples).
Oracle Disk Space
Oracle requires312 MB
to store the data:
select segment_name,sum(bytes)/1024/1024 MB
, count(segment_name)
, blocks * 8192 / (1024 * 1024 )
from user_segments
where segment_type='TABLE'
and segment_name=upper('MY_FILES')
group by segment_name, blocks ;
312 MB
of data;39936
blocks, that are something similar to PostgreSQL data pages.
110
extents, but I’m not sure how they account in the space compuation.
PostgreSQL Disk Space
The same data in PostgreSQL required324 MB
, so 12 MB
more than Oracle, that is roughly 4% more of disk space. It is therefore possible to say that the overall space is pretty much the same:
testdb=> SELECT reltuples, relpages,
pg_size_pretty( pg_relation_size( 'my_schema.my_files' ) )
FROM pg_class WHERE relname = 'my_files' AND relkind = 'r';
reltuples | relpages | pg_size_pretty
--------------|----------|----------------
1.872529e+06 | 41491 | 324 MB
(1 row)
fillfactor
has been set to 100% and the table has been VACUUM
ed.
Counting Pages
What I can see, is that PostgreSQL uses41491
data pages, while Oracle uses 39936
, so 1555
less data pages. Again, that is roughly the same 4% we already saw on effective space, that lead me think the Oracle datapages have the same size as PostgreSQL.
In fact, asking for the datapage size:
SQL> show parameter db_block_size;
NAME TYPE VALUE
------------- ------- -----
db_block_size integer 8192
From NUMERIC
to INT
update of 2020-02-24
One possible difference between the two tables, is the
NUMERIC
data type used by Oracle. After inspecting the values, I’ve seen that the bytes
column can be handled by an int4
(normal integer) value type, so I changed it in both Oracle and PostgreSQL. While in Oracle the size remained the same, 312 MB
, in PostgreSQL the size shrinked down to 318 MB
which is much more close to the Oracle one:
testdb=> ALTER TABLE vace.my_files ALTER COLUMN bytes SET DATA TYPE int;
ALTER TABLE
testdb=> vacuum full vace.my_files;
VACUUM
testdb=> SELECT reltuples, relpages, pg_size_pretty( pg_relation_size( 'vace.my_files' ) ) FROM pg_class WHERE relname = 'my_files' AND relkind = 'r';
reltuples | relpages | pg_size_pretty
--------------|----------|----------------
1.872529e+06 | 40757 | 318 MB
(1 row)
Conclusions
I really don’t have any. I know too little about Oracle storage to say why there is this difference in size, and I’m sure this is neither an advantage of Oracle nor a drawback of PostgreSQL.I don’t even know if this is the default behavior for any use-case, I hardly think so, but it is interesting to know that even a simple use-case like this can require a little more space on disk.