Introduction
The purpose of this post is to share a way to aggregate by Oracle database within SystemTap probes. Let’s describe a simple use case to make things clear.
Use Case
Let’s say that I want to get the number and the total size of TCP messages that have been sent and received by an Oracle database. To do so, let’s use 2 probes:
- tcp.sendmsg
- tcp.recvmsg
and fetch the command line of the processes that trigger the event thanks to the cmdline_str() function. For a process related to an Oracle database, the cmdline_str() output would look like one of these 2:
- ora_<something>_<instance_name>
- oracle<instance_name> (LOCAL=<something>
So let’s write two embedded C functions to extract the Instance name from each of the 2 strings described above.
Functions
get_oracle_name_b:
For example, this function would extract BDT from “ora_dbw0_BDT” or any “ora_<something>_BDT” string.
The code is the following:
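function get_oracle_name_b:string (mystr:string) %{
    /* Return what follows the second '_' (e.g. "BDT" from "ora_dbw0_BDT") */
    char *ptr;
    int ch = '_';
    char *strargs = STAP_ARG_mystr;
    ptr = strchr( strargs , ch);
    if (ptr != NULL)
        ptr = strchr( ptr + 1 , ch);
    /* Fall back to an empty string when there are fewer than two '_' */
    snprintf(STAP_RETVALUE, MAXSTRINGLEN, "%s", ptr != NULL ? ptr + 1 : "");
%}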
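To check the extraction logic outside the kernel, the same idea can be sketched as a plain user-space C function. The name oracle_name_from_background, its parameters and its return convention are assumptions made for this illustration; they are not part of tcp_by_db.stp.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-space twin of get_oracle_name_b.
 * Copies whatever follows the second '_' of cmdline into out.
 * Returns 0 on success, or -1 (with out set to "") when cmdline
 * contains fewer than two underscores.
 */
int oracle_name_from_background(const char *cmdline, char *out, size_t outlen)
{
    const char *first;
    const char *second;

    assert(cmdline != NULL && out != NULL && outlen > 0);
    out[0] = '\0';

    first = strchr(cmdline, '_');
    if (first == NULL)
        return -1;

    second = strchr(first + 1, '_');
    if (second == NULL)
        return -1;

    /* snprintf truncates safely if the name is longer than the buffer */
    snprintf(out, outlen, "%s", second + 1);
    return 0;
}
```

Because everything after the second underscore is kept, an instance name that itself contains an underscore is returned whole.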
get_oracle_name_f:
For example, this function would extract BDT from “oracleBDT (LOCAL=NO)”, “oracleBDT (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))” or any “oracleBDT (LOCAL=<something>” string.
The code is the following:
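function get_oracle_name_f:string (mystr:string) %{
    /* Return the text between "oracle" and the first space (e.g. "BDT" from "oracleBDT (LOCAL=NO)") */
    char *ptr;
    int ch = ' ';
    char substr_res[30];
    size_t len = 0;
    char *strargs = STAP_ARG_mystr;
    ptr = strchr( strargs, ch );
    /* Copy only when a space follows the 6-character "oracle" prefix */
    if (ptr != NULL && ptr - strargs > 6)
        len = ptr - strargs - 6;
    /* Never write past the end of substr_res */
    if (len > sizeof(substr_res) - 1)
        len = sizeof(substr_res) - 1;
    strncpy (substr_res, strargs + 6, len);
    substr_res[len] = '\0';
    snprintf(STAP_RETVALUE, MAXSTRINGLEN, "%s", substr_res);
%}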
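The foreground-process case can be exercised in user space in the same way. The following is a minimal sketch, assuming the command line always starts with the literal prefix "oracle"; the function name oracle_name_from_foreground and its return convention are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-space twin of get_oracle_name_f.
 * Copies the text found between the "oracle" prefix and the first
 * space into out, truncating it to fit the buffer.
 * Returns 0 on success, or -1 (with out set to "") when the prefix
 * or the space is missing.
 */
int oracle_name_from_foreground(const char *cmdline, char *out, size_t outlen)
{
    static const char prefix[] = "oracle";
    const size_t prefix_len = sizeof prefix - 1;
    const char *space;
    size_t name_len;

    assert(cmdline != NULL && out != NULL && outlen > 0);
    out[0] = '\0';

    if (strncmp(cmdline, prefix, prefix_len) != 0)
        return -1;

    space = strchr(cmdline + prefix_len, ' ');
    if (space == NULL)
        return -1;

    name_len = (size_t)(space - (cmdline + prefix_len));
    if (name_len >= outlen)
        name_len = outlen - 1;      /* keep room for the terminator */

    memcpy(out, cmdline + prefix_len, name_len);
    out[name_len] = '\0';           /* the NUL character, not the digit '0' */
    return 0;
}
```

The explicit length check matters because the destination buffer is small and the instance name length is not known in advance.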
Keeping in mind that the SystemTap aggregation operator is “<<<”, we can use those 2 functions to aggregate by instance name within the probes, passing cmdline_str() as the parameter:
probe tcp.recvmsg
{
    if ( isinstr(cmdline_str() , "ora_" ) == 1 && uid() == orauid) {
        tcprecv[get_oracle_name_b(cmdline_str())] <<< size
    } else if ( isinstr(cmdline_str() , "LOCAL=" ) == 1 && uid() == orauid) {
        tcprecv[get_oracle_name_f(cmdline_str())] <<< size
    } else {
        tcprecv["NOT_A_DB"] <<< size
    }
}
probe tcp.sendmsg
{
    if ( isinstr(cmdline_str() , "ora_" ) == 1 && uid() == orauid) {
        tcpsend[get_oracle_name_b(cmdline_str())] <<< size
    } else if ( isinstr(cmdline_str() , "LOCAL=" ) == 1 && uid() == orauid) {
        tcpsend[get_oracle_name_f(cmdline_str())] <<< size
    } else {
        tcpsend["NOT_A_DB"] <<< size
    }
}
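To make the aggregation idea concrete, here is a small user-space C imitation of what the probes do: derive a key from the command line, then keep a message count and a byte total per key, which are the values SystemTap exposes through @count and @sum. It is only a sketch and not how SystemTap implements statistics internally; the names tcp_stat, tcp_table, classify and record, the fixed table size and the is_oracle_uid flag (standing in for uid() == orauid) are all assumptions made for the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_KEYS 16   /* illustrative limit on distinct keys */
#define KEY_LEN  32   /* illustrative maximum key length */

struct tcp_stat {
    char key[KEY_LEN];
    unsigned long count;        /* number of messages, like @count */
    unsigned long long bytes;   /* total size, like @sum */
};

struct tcp_table {
    struct tcp_stat rows[MAX_KEYS];
    int used;
};

/* Pick the aggregation key with the same branch order as the probes. */
void classify(const char *cmdline, int is_oracle_uid, char *key, size_t keylen)
{
    const char *p;

    snprintf(key, keylen, "%s", "NOT_A_DB");
    if (!is_oracle_uid)
        return;

    if (strstr(cmdline, "ora_") != NULL) {
        /* background process: keep what follows the second '_' */
        p = strchr(cmdline, '_');
        if (p != NULL)
            p = strchr(p + 1, '_');
        if (p != NULL)
            snprintf(key, keylen, "%s", p + 1);
    } else if (strstr(cmdline, "LOCAL=") != NULL) {
        /* foreground process: keep what sits between "oracle" and ' ' */
        p = strchr(cmdline, ' ');
        if (strncmp(cmdline, "oracle", 6) == 0 && p != NULL && p > cmdline + 6)
            snprintf(key, keylen, "%.*s", (int)(p - cmdline - 6), cmdline + 6);
    }
}

/* Add one message of the given size under key; returns the row index or -1 if full. */
int record(struct tcp_table *t, const char *key, unsigned long size)
{
    int i;

    assert(t != NULL && key != NULL);
    for (i = 0; i < t->used; i++)
        if (strcmp(t->rows[i].key, key) == 0)
            break;

    if (i == t->used) {
        if (t->used == MAX_KEYS)
            return -1;
        snprintf(t->rows[i].key, KEY_LEN, "%s", key);
        t->rows[i].count = 0;
        t->rows[i].bytes = 0;
        t->used++;
    }

    t->rows[i].count++;
    t->rows[i].bytes += size;
    return i;
}
```

Background and foreground processes of the same instance end up under a single key, which is what lets one line per database appear in the report.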
As you can see, anything that is not an Oracle database is recorded and displayed as “NOT_A_DB”.
Based on this, the tcp_by_db.stp script has been created.
tcp_by_db.stp: script usage and output example
Usage:
$> stap -g ./tcp_by_db.stp <oracle uid> <refresh time ms>
Output Example:
$> stap -g ./tcp_by_db.stp 54321 10000
---------------------------------------------------------
NAME         NB_SENT   SENT_KB   NB_RECV   RECV_KB
---------------------------------------------------------
VBDT            5439      8231     10939     64154
NOT_A_DB          19         4        41       128
BDT               19        50        35       259
---------------------------------------------------------
NAME         NB_SENT   SENT_KB   NB_RECV   RECV_KB
---------------------------------------------------------
VBDT             267       109       391      2854
NOT_A_DB         102        19       116       680
BDT               26        55        47       326
---------------------------------------------------------
NAME         NB_SENT   SENT_KB   NB_RECV   RECV_KB
---------------------------------------------------------
VBDT             340       176       510      2940
NOT_A_DB         150         8       151      1165
BDT               42        77        61       423
Remarks:
- The oracle uid on my machine is 54321
- The refresh time has been set to 10 seconds
- You can see the aggregation for the 2 databases on my machine, and also for everything that is not an Oracle database
Whole Script source code
The whole code is available at:
Conclusion
Thanks to the embedded C functions we have been able to aggregate by Oracle database within SystemTap probes.