Sit Down. We Need to Talk About Your Scalar Functions.

Thanks for coming today. I apologize for baiting you in here with a picture of toilet paper. That’s not even in the US – that’s from my Iceland trip, before I flew back here to the US. Look, we all make mistakes. That’s not the point of this post.

You see, I’ve been reading this KB article about bugs in scalar function inlining. SQL Server 2019 tries to inline your functions using a technology they called Froid, but ever since Froid came out, it’s had one bug after another.

And I don’t think it’s Microsoft’s fault.

I think it’s your fault.

Because reading this KB article, I’m increasingly convinced that y’all are completely insane. You’re doing things with functions that I can’t even comprehend doing once, let alone doing so often that you feel the need to encapsulate the logic in a function and use it over and over again. Microsoft’s having to deal with functions that shouldn’t have ever passed a code review, let alone gone into production.

Here, you read it – and instead of reading it as a bug list, read it as a Microsoft support person who’s hearing a customer complain about inaccurate function results when their function has ___ in it:

 


This cumulative update includes several fixes across the following areas for scenarios in which a query that uses Scalar UDF Inlining may return an error or unexpected results:
  • Type mismatch error if the return type of the UDF is sql_variant (added in Microsoft SQL Server 2019 CU2)
  • UDFs referencing labels without an associated GOTO command return incorrect results (added in Microsoft SQL Server 2019 CU2)
  • Uninitialized variables used in condition (IF-ELSE) statements cause errors (added in Microsoft SQL Server 2019 CU2)
This cumulative update also blocks Inlining in the following scenarios:
  • If the UDF references certain intrinsic functions (e.g. @@ROWCOUNT) that may alter the results when inlined (added in Microsoft SQL Server 2019 CU2)
  • Aggregate functions being passed as parameters to a scalar UDF (added in Microsoft SQL Server 2019 CU2)
  • If the UDF references built-in views (for example: OBJECT_ID) (added in Microsoft SQL Server 2019 CU2)
  • If the UDF uses XML methods (added in Microsoft SQL Server 2019 CU4)
  • If the UDF contains a SELECT with ORDER BY without a “TOP 1” (added in Microsoft SQL Server 2019 CU4)
  • If the SELECT query performs an assignment in conjunction with the ORDER BY clause (e.g. SELECT @x = @x +1 FROM table ORDER BY column_name) (added in Microsoft SQL Server 2019 CU4)
  • If the UDF contains multiple RETURN statements (added in Microsoft SQL Server 2019 CU5)

 

When writing functions like that, your scientists were so preoccupied with whether they could that they didn’t stop to think if they should.

If you have scalar functions with anything but a SELECT in it, you should probably stop holding out hope that SQL Server 2019 is going to save you, and start rolling up your sleeves to rewrite those functions. By the time CU9 comes out, I’m kinda expecting scalar function inlining to be disabled altogether, and for Joe Sack to have thrown himself off a cliff in despair, having seen so many of your terrible functions.

Previous Post
How to Upgrade SQL Server Evaluation Edition to Developer Edition
Next Post
Things To Consider When Submitting Sessions to Paid Online Conferences

24 Comments. Leave new

  • Erik Darling
    June 25, 2020 8:25 am

    Yes, I’ve also received some interesting feedback about the utility of the feature.

    Reply
  • Emanuele Meazzo
    June 25, 2020 8:53 am

    Rule #1 of performance optimization is “Don’t do stupid stuff” , makes sense

    Reply
  • Leland Steffensen
    June 25, 2020 9:41 am

    Stupid stuff is what keeps me employed.

    Reply
    • Exactly! I understand the reason programmers want to use udfs and its a valid language feature to help with encapsulation, right? But if people started thinking about the consequences I’d have way fewer contracts 🙂

      Reply
  • As I always say to people – just because it’s available doesn’t mean it makes sense to do it that way.

    I was a little more extreme, we basically banned UDFs in the join & where clause cos of the stupid stuff people would do – and the fact that UDFs cause the profiler to fall over meaning we couldn’t work out where the app was going pear-shaped (even if it wasn’t the UDF, and it usually was).

    Reply
  • Paul-Sebastian Manole
    June 25, 2020 10:26 am

    Well, I think valid SQL syntax shouldn’t throw errors or return invalid results. How are we supposed to know what is allowed and what is not, if the code is valid SQL? So yeah, to me, these are bugs caused by the system, not how the users choose use it.

    Reply
  • “With great powers comes great stupidity”

    Reply
  • Heh… a play of words on that last one….

    “With code comes great stupidity”. 🙂

    Reply
  • Clever girl …

    Reply
  • I’m OK with CTEs but UDFs – careful what you wish for 🙂 Historically, I’ve always believed that that UDFs (and complex triggers) were made available to mask over a lack of capability within the App layer – we can’t easily do this in VB6 so lets push some business logic into the database layer to make life easier. No excuse for that behaviour these days right ? Oh wait “technical debt”….

    Reply
  • can’t like the argumentation. who can make this definition that a code is stupid? i only see one clearing instance here and this is the compiler. if a compiler compiles a code without errors, it is valid and must behave correctly and valid with the features from the server. this is imho the only valid definition. sorry brent –> your mail is emotional and unnecessary

    Reply
    • Dietmar – this isn’t mail. It’s a blog post.

      Thanks for stopping by!

      Reply
      • thanks for the clarification. thus: this blog post is emotional, personal and (in contrast to a lot of good information from you) unnecessary. otherwise, congratulations on your excellent media presence!

        Reply
        • right, that’s why it’s a blog post and not on wikipedia

          you don’t have to like everything you read

          Reply
    • Steven Knox
      June 26, 2020 9:47 am

      Your comment puts me in mind of the “obfuscated C” contest ( ioccc . org ) *.

      All of the winners of this contest over the years are acceptable code by your definition, yet few to none of them would be deployed to a production environment in any rational organization.

      “It compiles” is not the end of code validation; it’s just the start.

      * TIL the obfuscated C contest continues to this day.

      Reply
  • David Forck
    June 26, 2020 8:57 am

    For a long time now I’ve always seen functions in SQL Server as a hold over from programmers that don’t really understand that SQL is a set-based language

    Reply
    • H.M.Müller
      June 28, 2020 8:25 am

      All set-based calculi come with functions, also on single values. After all, even + and CASE WHEN are (value-level, not set-level) functions, and thus, they must, in modern times, be first-class citizens = one must be able to define new ones oneself. So no, “set-based” does not at all contradict “with scalar functions”.

      Reply
  • Douglas Coats
    June 26, 2020 10:09 am

    Some of the comments here are on par with the post itself lol.

    As much as I have been waiting for this feature, I still shy away from scalar functions for table view functions. I know Im weak lol

    Reply
  • Heh… “if a compiler compiles a code without errors, it is valid and must behave correctly…”. Please do keep thinking that because it makes for a target rich environment for those that know better.

    Reply
  • So what does “FROID” actually stand for? Could it be “First Release is Over and It Died”? 😀 😀 😀

    Reply
  • While I’m late to the game with this comment.
    “If it compiles it is OK” may be sufficient for a very junior programmer.
    Anyone competent is expected to consider the performance & suitability of algorithm in solving the problem.
    In SQL that means looking at the execution plan to see if the work that it is doing is reasonable for the results you obtain.
    eg: If you want the total of all rows, a table scan is reasonable. If you only want one row, a table scan is sub-optimal

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.