Oracle optimizer removing or coalescing subqueries

The Oracle optimizer often changes the query to get better performance. In this article, Jonathan Lewis explains two more optimizations involving subqueries.

Jonathan Lewis' continuing series on the Oracle optimizer and how it transforms queries into execution plans:

  1. Transformations by the Oracle Optimizer
  2. The effects of NULL with NOT IN on Oracle transformations
  3. Oracle subquery caching and subquery pushing
  4. Oracle optimizer removing or coalescing subqueries
  5. Oracle optimizer Or Expansion Transformations

So far, this series has examined the shape of some of the execution plans that Oracle’s optimizer can produce when there are subqueries in the WHERE clause and noted that the optimizer will often unnest a subquery to produce a join instead of using a filter operation that repeatedly runs the subquery. You’ve also seen that it’s possible to dictate the optimizer’s choice and have some control over the point in the execution plan where the filter operation takes place.

This installment moves on to a couple of the more sophisticated transformations that the optimizer can apply to reduce the number of subqueries that end up in the execution plan.

Removing aggregate subqueries

If you browse any of the public Oracle forums you’ve probably seen suggestions that certain patterns of queries would be more efficient if they were rewritten to use analytic functions rather than using a strategy involving self-referencing subqueries. It’s not commonly known that you don’t necessarily have to rewrite such queries; you may simply be able to tell the optimizer to do an internal rewrite for you.

To demonstrate this, I’ll use the emp and dept tables from the scott schema ($ORACLE_HOME/rdbms/admin/utlsampl.sql) as I did in the previous installment, running the demonstration from SQL*Plus on Oracle 19c (19.11.0.0). I’m going to both extend and simplify the query I wrote in that installment for ‘employess with a salary greater than the departmental average.’ The extension is that I’m going to include the department name in the output; the simplification is that I’m going to remove (temporarily) the reference to nvl(comm,0):

You might not expect the addition of the department name or the elimination of the nvl() expression to make a significant difference to the optimizer, but the resulting plan is dramatically different. Thanks to referential integrity, the dept table behaves as a sort of ‘focal point,’ allowing the optimizer to connect the two appearances of the emp table and use a mechanism to ‘remove aggregate subquery.’ Here’s the execution plan for the modified query (generated by autotrace):

The optimizer has eliminated the aggregate subquery, reducing the query from three tables and two blocks to a simple two-table join with an analytic aggregate – at least that’s what the window sort operation and the generated view name vw_wif_1 suggest.

It’s a reasonable guess that the internal rewrite uses the avg() over approach, but this is a case where you probably ought to check the 10053 (CBO) trace file if you want to be confident about the details. Here’s the ‘unparsed’ query (extracted from the trace file but with a lot of cosmetic editing) that the optimizer finally produced for costing this plan:

It’s interesting to note that the optimizer has introduced a CASE expression in the inline view (vw_wif_1) to generate a column that can be tested for nullity in the main query block. The human version for this strategy would probably have been to generate the ‘average over department’ as a column in the inline view that could be compared with sal in the outer query, e.g.:

I said I’d simplified the query by removing the reference to nvl(comm,0); I had to do this initially because the transformation would otherwise not be used. Eventually, I did get the transformation to appear with the expression sal + nvl(comm,0), but I had to add a virtual column to the table matching the expression and rewrite the query using the virtual column:

This anomaly is probably one introduced in Oracle 19.9 by a fix for bug 30593046 which produced wrong results in some cases involving ‘scalar subquery unnesting’ in 12.2.0.1. Unfortunately, the fix blocked too many cases where unnesting was safe, so a replacement bug fix (33325981) appeared in 19.13 to restore some of the blocked cases. However, my example (where unnesting is a necessary precursor to subquery removal) still isn’t allowed past the block. When I ran the test case on 12.2.0.1, the optimizer did unnest the subquery with the original expression but didn’t carry on to remove the subquery.

In previous installments, I’ve discussed using hints to control the optimizer’s strategy when you think you know more about the data than the optimizer does. In this case, the transformation relies on sorting (i.e., the Window Sort at operation 3), so there are likely to be occasions when a bad choice of plan does a lot more work than the optimizer’s arithmetic suggests, and you might want to block the transformation. Conversely there may be cases where the transformation doesn’t appear when it would be a really good idea. Unfortunately, even though the transformation has been available since 10gR2, there is no custom hint to force it or block it. The best you can do if you want to stop the transformation from taking place is to disable the feature using an alter session command or adding the opt_param() hint to the query: opt_param(‘_remove_aggregate_subquery’,’false’). However, if you think the optimizer isn’t using the transformation when it should be, there’s not a lot you can do about it. It’s possible that you may find cases where a hint to unnest the critical subquery will result in the optimizer deciding to go one step further and remove the subquery – but that’s just a conjecture, I haven’t yet produced an example to demonstrate that this is actually possible.

Coalescing subqueries

Consider the following query which will be repeated in its full context a little later:

When you examine the two subqueries, you’ll notice that they are identical, and both are driven by the same bind variable :p_id. The same subquery appears twice because the result is compared with two different columns from table test_t1, so it’s tempting to ask the question: “can we get each row just once then compare it twice?”

Since the optimizer often converts IN subqueries into existence subqueries you could consider transforming both these subqueries into existence subqueries then merging them into a single subquery that pushes the OR condition inside the subquery, doing something like:

Step 1:

Step 2:

The mechanism displayed in step 2 is known as ‘subquery coalescing,’ and it has been available to the optimizer and enabled since 11.2.0.1, and comes complete with the pair of hints /*+ [no_]coalesce_sq() */.

In this example, it’s, fairly obvious that if you had an index on nothing but t2(is_deleted, id) and if (despite the clue in the second column name) there were lots of rows matching the predicates t2.is_deleted=’N’ and t2.id = :p_id then it would make sense to combine the two subqueries so that you would only have to visit those rows once each, and could stop at the first occurrence of either the t1.id2 value or the t1.id3 value. On the other hand, if you had an index like t2(id, id2) there may be no benefit gained from the transformation, so it’s nice that the mechanism can be blocked with a simple hint.

In fact, someone raised a question about this specific example in one of the public Oracle forums because the transformation had had a most undesirable side effect and produced a bad plan that took far longer to run than the best possible plan. Here’s some code (supplied on the forum, but with a few enhancements) to produce a model of what the data looked like:

Here’s the resulting plan:

As usual, when pulling a live execution plan from memory the text of the subquery used for the FILTER at operation 1 has disappeared from the Predicate Information, but you can infer from the predicates reported at operation 3 that subquery coalescing has taken place. For confirmation, you could use explain plan and dbms_xplan.display() to get a report that shows the missing predicate information (being careful to remember that explain plan knows nothing about the bind variable – not even its type – which is why you see to_number(:p_id) in the following:

In fact, if you had reported the outline information in the call to dbms_xplan.display_cursor(), you would also have seen that it included the following two directives:

The key thing to note, however, is the amount of work that Oracle has had to do. For every row in the table, it has executed that subquery. Looking at the definition of table test_t2 (and especially its primary key) it’s easy to see that there can be at most one row where t2.id = :p_id, so why didn’t the optimizer unnest the subquery and use it to drive into test_t1? The answer is that while you would use only one row and one value of t2.id2 to drive a join into test_t1, there are two different columns in test_t1 in the query, and you’d have to do something complicated to unravel the pieces and join to test_t1 twice per row from test_t2 – so the optimizer doesn’t (yet) try it. There are further comments about this complication in the following blog article by Mohamed Houri: https://hourim.wordpress.com/2017/08/12/unnesting-of-coalesced-subqueries/

Since subquery coalescing has blocked a strategy that seems sensible to the human eye it’s worth telling the optimizer not to use the feature just to see if something interesting happens. Add the no_coalesce_sq() hint for the two named subquery blocks to the main query:

The or_expand(@main (1) (2)) hint – that I’ve commented out here – was something I added to force 12.2.0.1 to produce the plan that I got from 19.11.0.0. Here’s the more complicated, but far more efficient, execution plan that appeared:

A key feature that you can spot very easily is that this plan accesses a total of only 16 buffers, handling only a few rows, rather than accessing 200,000 buffers and handling 10,000 rows. It’s clearly a significant reduction in the workload, but how does this plan actually work?

Effectively the optimizer has turned the OR’ed pair of subqueries into a UNION ALL, then unnested the union all, then pushed the join to test_t1 inside the UNION ALL (in a step rather like the reverse of join factorization – which will be reviewed in a future installment). To eliminate rows that have already appeared in the first branch of the UNION ALL the optimizer has then added a filter subquery to the second branch.

You might note that if I had not had a unique (primary key) index on test_t2 there would have been SORT UNIQUE operations applied to the rows selected from test_t2 in both branches before the joins into test_t1.

In effect the query has been transformed into:

I won’t extract and reformat the ‘unparsed’ query from the CBO trace file since it is a little long and messy and adds little value to the SQL sketch above. Notice, however, from the Predicate Information provided by the call to explain plan that where I’ve used not exists(subquery) in the simplified framework, the optimizer actually uses lnnvl( exists( subquery ) ) at operation 8 of the plan. Oracle is just applying its generic ‘does not evaluate to true’ function to allow for possible nulls rather than using a simple ‘not’.

Although the resulting plan looks a little complex it is just a combination of a small number of simple transformations done in the right order, so it’s a little surprising that the optimizer doesn’t find it automatically as it searches its space of transformations. This omission has now been noted as (unpublished) bug 33957043: “Subquery Coalesce prevents Or Expansion/transformation”.

As a reminder about how difficult it is to use hints to control the fine detail of the optimizer, when I checked the Outline Information for this execution plan, it held 35 hints that would be needed in an SQL Plan Baseline to reproduce the plan. Though the list included an or_expand() hint, the no_coalesce_sq() hint didn’t make an appearance.

Summary

Converting an aggregate subquery strategy to an analytic function strategy is a rewrite that many people now do by hand when they spot the option and the numbers look right. The optimizer can do this as an internal transformation, though it doesn’t seem to take the option as often as it could. Unfortunately, there’s no explicit hint to force the transformation (though an unnest() hint may have a lucky side effect), and the only possible hint to block it is the opt_param() hint which may be needed in cases where the transformation introduces a large sorting overhead. In some cases, you may need to help the optimizer to pick the path by simplifying the SQL through the use of virtual columns, but that requirement may become redundant in a future release.

If you have multiple subqueries which look sufficiently similar and are used in a similar fashion, the optimizer can do the equivalent of ‘find the least common multiple’ so that it has to run a smaller number of subqueries, potentially using each call for multiple tests. This transformation has a related pair of hints, so it can be blocked or (if legal) forced. The demonstration of the transformation in this article highlighted a case where the optimizer missed an opportunity for using or expansion after doing subquery coalescing, so it’s worth knowing that the mechanism can be blocked.

A future installment will examine Or Expansion and Join Factorisation.