Using Linux perf: do we need to pass identifying info as arguments to important functions?

TL;DR:

  • One can use Linux perf to collect info about where the statement is spending its time.
  • But you only get function names. You can’t tell which SP/Table/etc it was.
  • perf allows to record variables but for some reason it doesn’t allow to record this->member_var.
  • We could work it around by passing “identifying” value like this->name as arguments to “important” functions like sp_head::execute (run a stored routine). Should we?

Longer:

Sometimes one has to analyze statement execution at a finer detail than ANALYZE FORMAT=JSON has. I was involved in such case recently: an UPDATE statement invoked a trigger which ran multiple SQL statements and invoked two stored functions. ANALYZE FORMAT=JSON showed that the top-level UPDATE statement didn’t have any issues. The issue was inside the trigger but where exactly?

We used Linux perf tool. In MariaDB (and MySQL) stored routine is executed by sp_head::execute(). One can track it like so:

# Add the probe:
perf probe -x `which mariadbd` --add _ZN7sp_head7executeEP3THDb
perf probe -x `which mariadbd` --add _ZN7sp_head7executeEP3THDb%return

# Collect a list of probes 
PROBES=`perf probe -l 'probe_mariadb*' | awk '{ printf " -e %s", $1 } '`;

# Now, PROBES has " -e probe_mariadbd:_ZN7sp_head7executeEP3THDb \
#    -e probe_mariadbd:_ZN7sp_head7executeEP3THDb__return" 

then you can note your session’s thread id (TODO: does this work when using a thread pool?) :
select tid from information_schema.processlist where id=connection_id();
then have pref record the profile and run your query:

perf record $PROBES -t $THREAD_ID
^C

perf script
mariadbd 1625339 [005] 874721.854399: probe_mariadbd:_ZN7sp_head7executeEP3THDb: (55942da55c3a)
mariadbd 1625339 [005] 874722.855064: probe_mariadbd:_ZN7sp_head7executeEP3THDb__return: (55942da55c3a <- 55942da586eb)
mariadbd 1625339 [005] 874722.855102: probe_mariadbd:_ZN7sp_head7executeEP3THDb: (55942da55c3a)
mariadbd 1625339 [005] 874724.855253: probe_mariadbd:_ZN7sp_head7executeEP3THDb__return: (55942da55c3a <- 55942da586eb)

Column #4 is time in seconds. This is nice but it’s not possible to tell which SP is which.

perf allows access to local variables:

perf probe -x `which mariadbd` -V  _ZN7sp_head7executeEP3THDb%return
Available variables at _ZN7sp_head7executeEP3THDb%return
        @<execute+0>
                CSET_STRING     old_query
                Diagnostics_area*       da
                ...
                sp_head*        this
                ...

but when I’ve tried to record this->m_name.str, that failed:

perf probe -x `pwd`/mariadbd  --add '_ZN7sp_head7executeEP3THDb this->m_name.str'
Probe on address 0x880c3a to force probing at the function entry.

this is not a data structure nor a union.
  Error: Failed to add events.

What if the name of the stored function was passed as function argument? I edited MariaDB’s source code and added it (full diff):

--- a/sql/sp_head.cc
+++ b/sql/sp_head.cc
@@ -1192,7 +1192,7 @@
 */
 
 bool
-sp_head::execute(THD *thd, bool merge_da_on_success)
+sp_head::execute(THD *thd, bool merge_da_on_success, const char *sp_name)
 {
   DBUG_ENTER("sp_head::execute");
   char saved_cur_db_name_buf[SAFE_NAME_LEN+1];

Compiled, restarted the server and I was able to add a perf probe that records the name:

perf probe -x `pwd`/mariadbd \
   --add '_ZN7sp_head7executeEP3THDbPKc sp_name:string'
perf record
^C
perf script 

This produced (line breaks added by me):

    mariadbd 1627629 [003] 877069.642164: probe_mariadbd:_ZN7sp_head7executeEP3THDbPKc: (564e917f1c67) 
      sp_name_string="test.func1"
    mariadbd 1627629 [003] 877070.642395: probe_mariadbd:_ZN7sp_head7executeEP3THDbPKc: (564e917f1c67) 
      sp_name_string="test.func2"

sp_name_string shows which stored function was invoked.

The question: Should we now go now and add function arguments:

  • For SP execution, show which SP is executed (like it was done in this post)
  • For an SQL statement, show its text
  • For subquery execution, add its select_number
  • … and so forth ?

This would also help with crashing bugs – the stack trace would be informative. Currently the crash reports get benefit from dispatch_command() function, a random example from MDEV-22262:

#14 0x0000562e90545488 in dispatch_command ( ...
  packet=packet@entry=0x7efdf0007a19 "UPDATE t1 PARTITION (p1) SET a=3 WHERE a=8" ... )

but if the crash happened when running a Prepared Statement, one is out of luck.

One thought on “Using Linux perf: do we need to pass identifying info as arguments to important functions?

Leave a comment