The Many Ways to Return Data From SQL DML

Probably the hardest thing to standardise in SQL is RETURNING data from DML statements. In this article, we’ll look at various ways of doing that with jOOQ, in many of jOOQ’s supported dialects, and with JDBC directly.

How to do it with jOOQ

Assuming the usual table from the sakila database:

CREATE TABLE actor (
  id INT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
  first_name TEXT,
  last_name TEXT,
  last_update TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

jOOQ took syntactic inspiration from Firebird, MariaDB, PostgreSQL, Oracle PL/SQL, which have quite the intuitive syntax. On any DML statement (i.e. INSERT, UPDATE, DELETE), you can just append RETURNING, like this:

INSERT INTO actor (first_name, last_name)
VALUES ('John', 'Doe')
RETURNING id, last_update

In summary, the RETURNING clause acts like any projection, i.e. what you usually do with SELECT. That means, you can also just return everything:

INSERT INTO actor (first_name, last_name)
VALUES ('John', 'Doe')
RETURNING *

So, in jOOQ, this just translates to

ActorRecord actor
ctx.insertInto(ACTOR, ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
   .values("John", "Doe")
   .returning()
   .fetchOne();

Alternatively, use returningResult() to return arbitrary projections. We’ve already seen how this is translated to:

  • Firebird
  • MariaDB
  • PostgreSQL (and related dialects, such as CockroachDB, YugabyteDB)
  • Oracle PL/SQL

Let’s look into some details of how others support the syntax:

How this is supported in PL/SQL

While Oracle PL/SQL supports the syntax per se, it isn’t exactly equivalent to PostgreSQL’s. First off, when jOOQ knows that it’s only inserting a single row, then it delegates the returning of data to the JDBC driver via Statement.RETURN_GENERATED_KEYS. So the generated SQL doesn’t have any trace of the RETURNING clause (although ojdbc will add it later on). See below for more details.

But when jOOQ doesn’t know the number of rows, or when it knows you’re inserting more than one row, then it falls back to a much more sophisticated emulation (actual contents may differ):

DECLARE

  -- Variables for input data
  i0 DBMS_SQL.VARCHAR2_TABLE;
  i1 DBMS_SQL.VARCHAR2_TABLE;

  -- Variables for output data
  o0 DBMS_SQL.VARCHAR2_TABLE;
  o1 DBMS_SQL.TIMESTAMP_TABLE;
  c0 sys_refcursor;
  c1 sys_refcursor;
BEGIN

  -- Input data
  i0(1) := ?;
  i0(2) := ?;
  i1(1) := ?;
  i1(2) := ?;

  -- Use Oracle's FORALL statement for bulk insertion
  FORALL i IN 1 .. i0.count
    INSERT INTO actor (first_name, last_name)
    VALUES (i0(i), i1(i))
    RETURNING id, last_update
    BULK COLLECT INTO o0, o1;

  -- Fetch the update count
  ? := sql%rowcount;

  -- Fetch the returned data
  OPEN c0 FOR SELECT * FROM table(o0);
  OPEN c1 FOR SELECT * FROM table(o1);
  ? := c0;
  ? := c1;
END;

It’s quite the laborious task to generate this SQL, as well as to implement the correct sequence of JDBC API calls to bind all inputs and return all outputs. Certainly not something you’d want to write manually.

I’m still hoping Oracle will improve their support for combining various syntactic features to make the above workarounds obsolete. None of this is necessary in PostgreSQL, which supports almost the same syntax inside of the database.

How this is supported in Db2, H2, standard SQL

The SQL standard has a native syntax for this, and it’s slightly more powerful than PostgreSQL’s, although I’d say quite less readable. It’s called the <data change delta table>, and it looks something like this:

SELECT id, last_update
FROM FINAL TABLE (
  INSERT INTO actor (first_name, last_name)
  VALUES ('John', 'Doe')
) a

So, instead of tweaking the INSERT statement syntax, there’s a { OLD | NEW | FINAL } TABLE operator, which takes a DML statement as argument with the following modifiers:

  • OLD: returns the data as it was prior to any default / trigger generated values or before the UPDATE, DELETE of data, in case that type of statement was executed.
  • NEW: returns the data as it was after any default or the UPDATE in case that type of statement was executed, but before trigger generated values
  • FINAL: returns the data as it was actually inserted, i.e. after all trigger generated values, or after the UPDATE in case that type of statement was executed

You can then further process results from such inserts directly in SQL, with a few limitations (e.g. some joins, set operations, aggregations are forbidden, but you can filter and project).

The most powerful use-case for these keywords is when used with an UPDATE statement, as it gives access to data before or after the actual UPDATE was made.

Regrettably, you have to make up your mind about a single keyword. You cannot access all 3 versions of the data in a single statement (see SQL Server’s OUTPUT further down, which can do this), e.g. when implementing an UPDATE audit log.

Note that unlike RETURNING, this syntax also works with MERGE

Let’s get back to PostgreSQL once more

If you want to do something as powerful as the above, in PostgreSQL, there’s an arcane syntactic feature where you can place RETURNING statements in a CTE (WITH clause) as follows:

WITH
  a (id, last_update) AS (
    INSERT INTO actor (first_name, last_name)
    VALUES ('John', 'Doe')
    RETURNING id, last_update
  )
SELECT *
FROM a;

Curiously, you can’t do the same thing in a derived table. E.g. this won’t work, even if otherwise, CTE and derived tables are largely logically equivalent:

-- Doesn't work:
SELECT *
FROM (
  INSERT INTO actor (first_name, last_name)
  VALUES ('John', 'Doe')
  RETURNING id, last_update
) a (id, last_update);

The other dialects that mimicked PostgreSQL’s syntax don’t support the above, i.e. Firebird, MariaDB, and Oracle PL/SQL.

SQL Server’s OUTPUT clause

SQL Server has an OUTPUT clause that might be a bit weird, syntactically, but it’s also a bit more powerful even than Db2’s FINAL TABLE as it allows to access data both from before and after the modifications at the same time.

For this, SQL Server introduces INSERTED and DELETED pseudo tables, which contain the data after or before the UPDATE.

The drawback is that there’s no native way to access trigger generated values in SQL Server, so jOOQ’s emulation is a bit more elaborate,

-- Declare an in-memory table for the results
DECLARE @result TABLE (
  id INT,
  last_update DATETIME2
);

-- Insert the data and return the results into the in-memory table
INSERT INTO actor (first_name, last_name)
OUTPUT inserted.id, inserted.last_update
INTO @result
VALUES ('John', 'Doe');

-- Merge trigger generated values into the previous results
MERGE INTO @result r
USING (
  SELECT actor.id, actor.last_update AS x
  FROM actor
) s
ON r.id = s.id
WHEN MATCHED THEN UPDATE SET last_update = s.x;

-- Return the results to the client
SELECT id, last_update
FROM @result;

Again, the actual SQL may differ a bit, but you get the point. This all needs to be done to be able to fetch trigger generated values. Currently, jOOQ’s runtime isn’t aware of tables having triggers, although that might change in the future with #13912.

Note that unlike RETURNING, this syntax also works with MERGE

Using JDBC to fetch generated keys (Oracle, HSQLDB)

As mentioned above for Oracle, we can also use JDBC to fetch generated keys. In principle, this is how it works:

try (PreparedStatement s = c.prepareStatement(
    "INSERT INTO actor (first_name, last_name) VALUES (?, ?)", 
    new String[] { "ID", "LAST_UPDATE" }
)) {
    s.setString(1, firstName);
    s.setString(2, lastName);
    s.executeUpdate();

    try (ResultSet rs = s.getGeneratedKeys()) {
        while (rs.next()) {
            System.out.println("ID = " + rs.getInt(1));
            System.out.println("LAST_UPDATE = " + rs.getTimestamp(2));
        }
    }
}

Unfortunately, apart from HSQLDB and Oracle, almost no JDBC driver implements this!

Using JDBC to fetch generated keys (others)

In most other SQL dialects, we have to do something like this:

try (PreparedStatement s = c.prepareStatement(
    "INSERT INTO actor (first_name, last_name) VALUES (?, ?)", 
    Statement.RETURN_GENERATED_KEYS
)) {
    s.setString(1, firstName);
    s.setString(2, lastName);
    s.executeUpdate();

    try (ResultSet rs = s.getGeneratedKeys()) {
        System.out.println("ID = " + rs.getInt(1));
        
        // But there's no way to access LAST_UPDATE here. We'll
        // have to run another query
    }
}

Many JDBC drivers support this in some way, but not all of them support this approach:

  • For multiple inserted rows
  • For statements other than INSERT
  • For tables without an identity, or where the identity isn’t the primary key

Conclusion

As ever so often, the differences between various SQL vendors are vast, both in terms of:

  • SQL support
  • JDBC support

jOOQ has been hacking JDBC for you, so you don’t have to. With jOOQ, all of the above typically works like this, on all dialects, at least when you’re inserting a single row:

ActorRecord actor
ctx.insertInto(ACTOR, ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
   .values("John", "Doe")
   .returning()
   .fetchOne();

2 thoughts on “The Many Ways to Return Data From SQL DML

  1. Thanks for the great write up. Is there some way to force jOOQ into preferring a different workaround than the default for Oracle? We find that the JDBC-based solution fails on views with instead of-triggers. I’m suddenly hopeful that you might have my back here after all. 🤞

Leave a Reply