Into my first post related to Exadata I provided a perl script to extract real-time metrics from the cells based on their cumulative metrics (see here).
In this post I focus on the information available from the database: Cumulative “cell” Statistics and cumulative “cell” Wait Events related to Exadata (Uwe described the most important ones into this post).
As usual the philosophy behind the 3 scripts used into this post is: extract real-time information from the cumulative views by taking a snapshot each second (default interval) and computes the differences with the previous snapshot.
So to collect Real-Time Wait Events related to Exadata, I use a perl script “system_event.pl” (click on the link and then on the view source button to copy/paste the source code) already used into one of my previous post “Measure oracle real-time I/O performance”.
But this time I focus only on “cell” Wait Events related to Exadata that way :
./system_event.pl event_like='%cell%'
02:30:41 INST_NAME EVENT NB_WAITS TIME_WAITED ms/Wait
02:30:41 BDT1 cell smart file creation 7 950197 135.742
--------------------------------------> NEW
02:30:42 INST_NAME EVENT NB_WAITS TIME_WAITED ms/Wait
02:30:42 BDT1 cell smart file creation 3 543520 181.173
02:30:42 BDT1 cell single block physical read 3 4420 1.473
As you can see, during the last second the database waited 3 times on “Cell smart file creation”
Now to collect Real-Time Statistics related to Exadata, I use the “sysstat.pl” script (click on the link and then on the view source button to copy/paste the source code) that way:
./sysstat.pl statname_like='%cell%'
03:15:29 INST_NAME NAME VALUE
03:15:29 BDT1 cell smart IO session cache lookups 11
03:15:29 BDT1 cell scans 17
03:15:29 BDT1 cell blocks processed by cache layer 3551
03:15:29 BDT1 cell blocks processed by txn layer 3551
03:15:29 BDT1 cell blocks processed by data layer 3551
03:15:29 BDT1 cell blocks helped by minscn optimization 3551
03:15:29 BDT1 cell physical IO interconnect bytes returned by smart scan 1421352
03:15:29 BDT1 cell physical IO interconnect bytes 2756648
03:15:29 BDT1 cell IO uncompressed bytes 29089792
03:15:29 BDT1 cell physical IO bytes eligible for predicate offload 29089792
--------------------------------------> NEW
03:15:30 INST_NAME NAME VALUE
03:15:30 BDT1 cell smart IO session cache lookups 33
03:15:30 BDT1 cell scans 33
03:15:30 BDT1 cell blocks processed by cache layer 9896
03:15:30 BDT1 cell blocks processed by txn layer 9896
03:15:30 BDT1 cell blocks processed by data layer 9896
03:15:30 BDT1 cell blocks helped by minscn optimization 9896
03:15:30 BDT1 cell physical IO interconnect bytes returned by smart scan 6602936
03:15:30 BDT1 cell physical IO interconnect bytes 6701240
03:15:30 BDT1 cell IO uncompressed bytes 81068032
03:15:30 BDT1 cell physical IO bytes eligible for predicate offload 81068032
As you can see during the last second the statistic cell physical IO bytes has increased by 81068032 .
If we need to diagnose more in depth and link in real-time those “cell” statistics with one or more sql_id, we can use the “sqlidstat.pl” (click on the link and then on the view source button to copy/paste the source code) that way:
./sqlidstat.pl statname_like='%cell%'
--------------------------------------> NEW
03:15:29 SID SQL_ID NAME VALUE
03:15:29 ALL 0ab8xuf6kuud5 cell smart IO session cache hits 4
03:15:29 ALL 0ab8xuf6kuud5 cell scans 10
03:15:29 ALL 0ab8xuf6kuud5 cell blocks processed by cache layer 1193
03:15:29 ALL 0ab8xuf6kuud5 cell blocks processed by data layer 1193
03:15:29 ALL 0ab8xuf6kuud5 cell blocks processed by txn layer 1193
03:15:29 ALL 0ab8xuf6kuud5 cell blocks helped by minscn optimization 1193
03:15:29 ALL 0ab8xuf6kuud5 cell physical IO interconnect bytes returned by smart scan 356096
03:15:29 ALL 0ab8xuf6kuud5 cell physical IO interconnect bytes 2232064
03:15:29 ALL 0ab8xuf6kuud5 cell IO uncompressed bytes 9773056
03:15:29 ALL 0ab8xuf6kuud5 cell physical IO bytes eligible for predicate offload 9773056
--------------------------------------> NEW
03:15:30 SID SQL_ID NAME VALUE
03:15:30 ALL 0ab8xuf6kuud5 cell smart IO session cache hits 34
03:15:30 ALL 0ab8xuf6kuud5 cell scans 34
03:15:30 ALL 0ab8xuf6kuud5 cell blocks processed by cache layer 10522
03:15:30 ALL 0ab8xuf6kuud5 cell blocks processed by data layer 10522
03:15:30 ALL 0ab8xuf6kuud5 cell blocks processed by txn layer 10522
03:15:30 ALL 0ab8xuf6kuud5 cell blocks helped by minscn optimization 10522
03:15:30 ALL 0ab8xuf6kuud5 cell physical IO interconnect bytes returned by smart scan 6039016
03:15:30 ALL 0ab8xuf6kuud5 cell physical IO interconnect bytes 6039016
03:15:30 ALL 0ab8xuf6kuud5 cell IO uncompressed bytes 86196224
03:15:30 ALL 0ab8xuf6kuud5 cell physical IO bytes eligible for predicate offload 86196224
I removed the “INST_NAME” column for lisibilty.
By default all the SID have been aggregated but you could also filter on a particular sid (see the help).
So we can see that during the last second, most of the offload processing that we observed at the database level is related to the sql_id 0ab8xuf6kuud5.
Remarks :
- Those 3 scripts are not exclusively related to Exadata as you can use them on all the wait events or statistics, I simply used them with the event_like or statname_like arguments to focus on ‘%cell%’ .
- You can choose the number of snapshots to display and the time to wait between snapshots.
- The scripts are oracle RAC aware : you can work on all the instances, a subset or the local one.
- You have to set oraenv on one instance of the database you want to diagnose first.
Part II of this subject available here.