Usage of disk space in Oracle and PostgreSQL

A few days ago I built a table in Oracle (11, if that matters) to store a few hundred megabytes of data. But I don’t feel at home using Oracle, so I decided to export the data and import it back in PostgreSQL 12.
Surprisingly, PostgreSQL requires more data space to store the same amount of data.

I’m not saying anything about who is the best, and I don’t know the exact reasons why this happens, however this is just what I’ve observed hpoing this can be useful to someone else!
So please don’t flame!

Table structure

The table is really simple, and holds data about files on a disk. It does not have even a key, since it is just data I must mangle and then throw away.
testdb=> \d my_schema.my_files
                        Table "my_schema.my_files"
  Column   |          Type           | Collation | Nullable | Default 
-----------|-------------------------|-----------|----------|---------
 filename  | character varying(200)  |           |          | 
 directory | character varying(2048) |           |          | 
 md5sum    | character varying(128)  |           |          | 
 bytes     | bigint                  |           |          | 

I’ve seen no changes in using text against a varchar, I used the latter just to be as similar as possible in the definition with Oracle.
The table is populated with 1872529 tuples (around 2 million tuples).

Oracle Disk Space

Oracle requires 312 MB to store the data:
 select segment_name,sum(bytes)/1024/1024 MB
    , count(segment_name)
    , blocks * 8192 / (1024 * 1024 )
    from user_segments
    where segment_type='TABLE'
    and segment_name=upper('MY_FILES')
    group by segment_name, blocks ;
The results of the above query are:
  • 312 MB of data;
  • 39936 blocks, that are something similar to PostgreSQL data pages.
The table has 110 extents, but I’m not sure how they account in the space compuation.

PostgreSQL Disk Space

The same data in PostgreSQL required 324 MB, so 12 MB more than Oracle, that is roughly 4% more of disk space. It is therefore possible to say that the overall space is pretty much the same:
testdb=> SELECT reltuples, relpages, 
         pg_size_pretty( pg_relation_size( 'my_schema.my_files' ) ) 
         FROM pg_class WHERE relname = 'my_files' AND relkind = 'r';
         
  reltuples   | relpages | pg_size_pretty 
--------------|----------|----------------
 1.872529e+06 |    41491 | 324 MB
(1 row)
Please note that fillfactor has been set to 100% and the table has been VACUUMed.

Counting Pages

What I can see, is that PostgreSQL uses 41491 data pages, while Oracle uses 39936, so 1555 less data pages. Again, that is roughly the same 4% we already saw on effective space, that lead me think the Oracle datapages have the same size as PostgreSQL.
In fact, asking for the datapage size:
SQL> show parameter db_block_size;

NAME          TYPE    VALUE 
------------- ------- ----- 
db_block_size integer 8192 
shows the same size as PostgreSQL.

From NUMERIC to INT

update of 2020-02-24
One possible difference between the two tables, is the NUMERIC data type used by Oracle. After inspecting the values, I’ve seen that the bytes column can be handled by an int4 (normal integer) value type, so I changed it in both Oracle and PostgreSQL. While in Oracle the size remained the same, 312 MB, in PostgreSQL the size shrinked down to 318 MB which is much more close to the Oracle one:
testdb=> ALTER TABLE vace.my_files ALTER COLUMN bytes SET DATA TYPE int;
ALTER TABLE
testdb=> vacuum full vace.my_files;
VACUUM
testdb=> SELECT reltuples, relpages, pg_size_pretty( pg_relation_size( 'vace.my_files' ) ) FROM pg_class WHERE relname = 'my_files' AND relkind = 'r';
  reltuples   | relpages | pg_size_pretty 
--------------|----------|----------------
 1.872529e+06 |    40757 | 318 MB
(1 row)

Conclusions

I really don’t have any. I know too little about Oracle storage to say why there is this difference in size, and I’m sure this is neither an advantage of Oracle nor a drawback of PostgreSQL.
I don’t even know if this is the default behavior for any use-case, I hardly think so, but it is interesting to know that even a simple use-case like this can require a little more space on disk.

The article Usage of disk space in Oracle and PostgreSQL: a simple use case has been posted by Luca Ferrari on February 24, 2020