Advanced Oracle Troubleshooting Guide, Part 13: OStackProf for Linux, Unix & MacOSX Clients

2022-09-17

About 14 years ago I published the OStackProf script that ran on Windows sqlplus clients (the Oracle server could run on any platform that supported ORADEBUG SHORT_STACK).

I hardly use Windows at all nowadays, so it’s about time to publish the same functionality for MacOSX, Linux and Unix sqlplus clients. This is mostly useful when hacking dev systems, troubleshooting test systems and, in rare cases, carefully troubleshooting non-fatal processes in production too.

OStackProf is just a .sql script that you run on your sqlplus client. There is no need to log in to the server: ORADEBUG SHORT_STACK can fetch server process stacks (by sending a SIGUSR2 signal to the target process) from a remote server to your client machine. Look into the script to see exactly how it works. At a high level, it just runs a number of ORADEBUG SHORT_STACK commands, spools their output to a local file on your client workstation and then uses either a .VBS or a .py script to post-process and summarize the stack traces locally.

This is why I have split the scripts into two files, one for Unix-like clients and the other for Windows:

SQL> @help ostack

NAME                      DESCRIPTION                                            USAGE
------------------------- ------------------------------------------------------ -----------------------------------------
ostackprofu.sql           Sample Oracle process call stacks and show a profile   @ostackprofu <SID> <SLEEP> <NUM_SAMPLES>
                          (Unix/Linux/Mac sqlplus)                               @ostackprofu 123 0.1 100

ostackprofw.sql           Sample Oracle process call stacks and show a profile   @ostackprofw <SID> <SLEEP> <NUM_SAMPLES>
                          (Windows sqlplus)                                      @ostackprofw 123 0.1 100
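
The post-processing step (find the call chain shared by all samples, then count what runs on top of it) can be sketched in a few lines of Python. This is only a minimal illustration, not the actual .py script shipped with the TPT scripts. It assumes each sample is one line of frames written leaf-first and separated by "<-", with a "+offset" after each function name; the function names parse_short_stack, common_prefix and profile are made up for this sketch. Real output may also contain frames from the stack-dumping code itself, which this sketch does not filter out.

```python
import re
from collections import Counter


def parse_short_stack(line):
    """Turn one 'leaf()+n<-caller()+n<-...<-root()+n' line into a root-first list."""
    frames = [f.strip() for f in line.split("<-") if f.strip()]
    # Drop the '+offset' suffix so the same function always compares equal.
    names = [re.sub(r"\(\)\+\w+$", "()", f) for f in frames]
    names.reverse()  # assumed input is leaf-first; we want root-first
    return names


def common_prefix(stacks):
    """Return the leading frames that are identical in every stack."""
    prefix = []
    for frames in zip(*stacks):
        if len(set(frames)) != 1:
            break
        prefix.append(frames[0])
    return prefix


def profile(lines):
    """Return (common prefix, Counter of the call chains above that prefix)."""
    stacks = [parse_short_stack(l) for l in lines if l.strip()]
    prefix = common_prefix(stacks)
    counts = Counter()
    for stack in stacks:
        extra = stack[len(prefix):]
        key = ("->" + "->".join(extra)) if extra else "(no additional frames)"
        counts[key] += 1
    return prefix, counts


if __name__ == "__main__":
    # Hypothetical samples, shortened for illustration only.
    samples = [
        "kdddgb()+100<-kddlkr()+20<-updrow()+5<-qerupUpdRow()+1<-main()+3",
        "kddlkr()+20<-updrow()+5<-qerupUpdRow()+1<-main()+3",
        "qerixtFetch()+7<-qerupUpdRow()+1<-main()+3",
        "kdddgb()+104<-kddlkr()+20<-updrow()+5<-qerupUpdRow()+1<-main()+3",
    ]
    prefix, counts = profile(samples)
    print("common prefix:", "->".join(prefix))
    for chain, n in counts.most_common():
        print(f"{n:4d}   {chain}")
```

Counting only the frames above the common prefix keeps the report short: the shared part is printed once, and the differences between samples are what show where the time goes.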

As I already described how to use and interpret this tool back in 2008, I won’t write much more here. Below is just example output from running ostackprofu.sql on my Mac OSX laptop, connected to an Oracle database running on a Linux server:

SQL> @ostackprofu 480 0.1 100

-- oStackProf v1.2 - EXPERIMENTAL script by Tanel Poder ( https://tanelpoder.com )

WARNING! This script can crash the target process on Oracle 9.2 on Windows
and maybe other versions/platforms as well. Test this script out thorouhgly
in your dev environment first!
Hit CTRL+C to cancel, ENTER to continue...

Sampling...

Below is the stack prefix common to all samples:
------------------------------------------------------------------------
Frame->function()
------------------------------------------------------------------------
# 48  __libc_start_main()
# 47   main()
# 46    ssthrdmain()
# 45     opimai_real()
# 44      sou2o()
# 43       opidrv()
# 42        opirip()
# 41         kkjrdp()
# 40          kkjsexe()
# 39           rpiswu2()
# 38            jslve_cdb_execute()
# 37             kpdbSwitch()
# 36              jslve_execute()
# 35               jslve_execute0()
# 34                kpdbSwitch()
# 33                 jslvCDBSwitchUsr()
# 32                  jslvswu()
# 31                   jslvec_execcb()
# 30                    OCIStmtExecute()
# 29                     kpuexec()
# 28                      kpurcsc()
# 27                       upirtrc()
# 26                        kpoodr()
# 25                         opiodr()
# 24                          kpoal8()
# 23                           opiexe()
# 22                            kkxexe()
# 21                             peicnt()
# 20                              plsql_run()
# 19                               pfrrun()
# 18                                pfrrun_no_tool()
# 17                                 pfrinstr_EXECC()
# 16                                  pevm_EXECC()
# 15                                   psdnal()
# 14                                    psddr0()
# 13                                     rpidrv()
# 12                                      rpiswu2()
# 11                                       rpidru()
# 10                                        skgmstack()
#  9                                         rpidrus()
#  8                                          opiodr()
#  7                                           opipls()
#  6                                            opiexe()
#  5                                             updexe()
#  4                                              updThreePhaseExe()
#  3                                               updaul()
#  2                                                qerupFetch()
#  1                                                 qerupUpdRow() <-- all 100 stack traces contain the function call chain until this function

----------------------------------------------------------------------
-- # Num additional functions in call stack
----------------------------------------------------------------------
  18   ->updrow()->kddlkr()->kdddgb() <-- this is the top additional call chain "ending" on top of the common stack. As I took 100 samples with OstackProf, the 18 means roughly 18% of time
   8   ->updrow()->kddlkr()->kdddgb()->ktbgcur()->kcbgcur()
   7   ->updrow()
   6   ->updrow()->kddlkr()->kdddgb()->ktbgfi()
   5   ->updrow()->kddlkr()->kdddgb()->ktbgcur()->kcbgcur()->kcbzgs()->kssadf_numa_intl()
   5   ->updrow()->kddlkr()
   5   ->qerixtFetch()
   4   ->qerixtFetch()->kdifxs0()
   4   ->qerixtFetch()->kafgex1()
   3   ->updrow()->kddlkr()->kdddgb()->ktbgcur()->kcbgcur()->kpdbBufCacheOpen()
   3   ->updrow()->kddlkr()->kdddgb()->ktbgcur()->kcbgcur()->kcbs_simulate()->kcbsacc()
   3   ->updrow()->kddlkr()->kcbrls()->kcbzar()
   3   ->updrow()->kddlkr()->kcbrls()
   2   ->updrow()->kddlkr()->kdddgb()->ktbgfi()->ktb4GetItlScn8()
   2   ->updrow()->kddlkr()->kdddgb()->ktbgcur()
   2   ->updrow()->kddlkr()->kdddcp()
   2   ->updrow()->kddlkr()->kcbrls()->kcbzfs()->kssrmf_numa_intl()
   2   ->updrow()->dmlsrvRetryLog()
   1   ->updrow()->updicf()
   1   ->updrow()->ktifspt()
   1   ->updrow()->ktcspGrabInternalSavepoint()
   1   ->updrow()->kskchk()
   1   ->updrow()->kddlkr()->kdddgb()->ktcgtx()
   1   ->updrow()->kddlkr()->kdddgb()->ktbgfi()->ktb4GetItlScn()
   1   ->updrow()->kddlkr()->kdddgb()->ktbgfi()->kscn_to_ub8_impl()
   1   ->updrow()->kddlkr()->kdddgb()->ktbgcur()->kcbgcur()->ksolshash()
   1   ->updrow()->kddlkr()->kdddgb()->ktbgcur()->kcbgcur()->kcbzgs()
   1   ->updrow()->kddlkr()->kdddgb()->ktbgcur()->kcbgcur()->kcbs_simulate()
   1   ->updrow()->kddlkr()->kdddgb()->kdrreb2()
   1   ->updrow()->kddlkr()->kdddgb()->kddlrc()
   1   ->updrow()->kddlkr()->kcbrls()->kssrmf_numa_intl()
   1   ->updrow()->dmlsrvRetryLog()->kgghash()
   1   ->qerixtFetch()->kdifxs0()->kcbrls()->kcbzfs()->kssrmf_numa_intl()
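
Since each count is a number of samples, dividing by the total number of samples gives a rough share of time, and the long list can be rolled up by the first function above the common stack. Here is a small Python sketch of that arithmetic. It assumes the report lines look like the ones above (a count followed by a "->"-separated chain); the helper name rollup_by_first_frame is made up for this example and only a handful of lines from the output are used as input.

```python
import re
from collections import Counter

# A report line: leading count, then a chain starting with '->'.
LINE_RE = re.compile(r"^\s*(\d+)\s+->(.+?)\s*$")


def rollup_by_first_frame(report_lines):
    """Sum sample counts per first function above the common stack."""
    totals = Counter()
    for line in report_lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip headers, blank lines and anything unparseable
        count, chain = int(m.group(1)), m.group(2)
        first = chain.split("->")[0]
        totals[first] += count
    return totals


if __name__ == "__main__":
    # A subset of the report above, not the full output.
    report = """
      18   ->updrow()->kddlkr()->kdddgb()
       8   ->updrow()->kddlkr()->kdddgb()->ktbgcur()->kcbgcur()
       7   ->updrow()
       5   ->qerixtFetch()
       4   ->qerixtFetch()->kdifxs0()
    """.splitlines()

    num_samples = 100  # the <NUM_SAMPLES> argument passed to the script
    for func, n in rollup_by_first_frame(report).most_common():
        print(f"{func:<16} {n:>3} samples  ~{100 * n / num_samples:.0f}% of samples")
```

With sampling, these percentages are estimates; chains seen in only one or two samples say little on their own.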

This stack profile is actually from a real-life problem that I recently troubleshot. I will blog or livestream about it sometime in the future.

That’s all!

Download the latest version of my TPT scripts from GitHub or a .zip file.

Enjoy!

