While the active session history extension for PostgreSQL is still in beta, some information is added to it.
The pg_active_session_history view is currently made of:
View "public.pg_active_session_history"
Column | Type | Collation | Nullable | Default
------------------+--------------------------+-----------+----------+---------
ash_time | timestamp with time zone | | |
datid | oid | | |
datname | text | | |
pid | integer | | |
usesysid | oid | | |
usename | text | | |
application_name | text | | |
client_addr | text | | |
client_hostname | text | | |
client_port | integer | | |
backend_start | timestamp with time zone | | |
xact_start | timestamp with time zone | | |
query_start | timestamp with time zone | | |
state_change | timestamp with time zone | | |
wait_event_type | text | | |
wait_event | text | | |
state | text | | |
backend_xid | xid | | |
backend_xmin | xid | | |
top_level_query | text | | |
query | text | | |
cmdtype | text | | |
queryid | bigint | | |
backend_type | text | | |
blockers | integer | | |
blockerpid | integer | | |
blocker_state | text | | |
You could see it as samplings of pg_stat_activity providing more information:
- ash_time: the sampling time
- top_level_query: the top level statement (in case PL/pgSQL is used)
- query: the statement being executed (not normalised, as it is in pg_stat_statements, means you see the values)
- cmdtype: the statement type (SELECT,UPDATE,INSERT,DELETE,UTILITY,UNKNOWN,NOTHING)
- queryid: the queryid of the statement which links to pg_stat_statements
- blockers: the number of blockers
- blockerpid: the pid of the blocker (if blockers = 1), the pid of one blocker (if blockers > 1)
- blocker_state: state of the blocker (state of the blockerpid)
Thanks to the queryid field you are able to link the session activity with the sql activity.
The information related to the blocking activity (if any) has been added recently. Why? To easily drill down in case of session being blocked.
Let’s see how we could display some interesting information in case of blocked session(s), for examples:
- The wait chain
- The seconds in wait in this chain
- The percentage of the total wait time that this chain represents
As PostgreSQL provides recursive query and window functions, let’s make use of them to write this query:
postgres@pgu:~$ cat pg_ash_wait_chain.sql
WITH RECURSIVE search_wait_chain(ash_time,pid, blockerpid, wait_event_type,wait_event,level, path)
AS (
SELECT ash_time,pid, blockerpid, wait_event_type,wait_event, 1 AS level,
'pid:'||pid||' ('||wait_event_type||' : '||wait_event||') ->'||'pid:'||blockerpid AS path
from pg_active_session_history WHERE blockers > 0
union ALL
SELECT p.ash_time,p.pid, p.blockerpid, p.wait_event_type,p.wait_event, swc.level + 1 AS level,
'pid:'||p.pid||' ('||p.wait_event_type||' : '||p.wait_event||') ->'||swc.path AS path
FROM pg_active_session_history p, search_wait_chain swc
WHERE p.blockerpid = swc.pid and p.ash_time = swc.ash_time and p.blockers > 0
)
select round(100 * count(*) / cnt)||'%' as "% of total wait",count(*) as seconds,path as wait_chain from (
SELECT pid,wait_event,path,sum(count) over() as cnt from (
select ash_time,level,pid,wait_event,path,count(*) as count, max(level) over(partition by ash_time,pid) as max_level
FROM search_wait_chain where level > 0 group by ash_time,level,pid,wait_event,path
) as all_wait_chain
where level=max_level
) as wait_chain
group by path,cnt
order by count(*) desc;
Let’s launch this query while only one session is being blocked by another one:
postgres@pgu:~$ psql -f pg_ash_wait_chain.sql
% of total wait | seconds | wait_chain
-----------------+---------+--------------------------------------------
100% | 23 | pid:1890 (Lock : transactionid) ->pid:1888
(1 row)
It means that the pid 1890 is waiting since 23 seconds on the transactionid wait event, while being blocked by pid 1888. This wait chain represents 100% of the blocking activity time.
Now another session comes into the game, query the active session history view one more time:
postgres@pgu:~$ psql -f pg_ash_wait_chain.sql
% of total wait | seconds | wait_chain
-----------------+---------+------------------------------------------------------------------------------
88% | 208 | pid:1890 (Lock : transactionid) ->pid:1888
12% | 29 | pid:1913 (Lock : transactionid) ->pid:1890 (Lock : transactionid) ->pid:1888
(2 rows)
So we still see our first blocking chain. It is now not the only one (so represents 88% of the blocking activity time).
We can see a new chain that represents 12% of the blocking activity time:
- pid 1913 (waiting on transactionid) is blocked since 29 seconds by pid 1890 (waiting on transactionid) that is also blocked by pid 1888.
Let’s commit the transaction hold by pid 1888 and launch the query again 2 times:
postgres@pgu:~$ psql -f pg_ash_wait_chain.sql
% of total wait | seconds | wait_chain
-----------------+---------+------------------------------------------------------------------------------
57% | 582 | pid:1890 (Lock : transactionid) ->pid:1888
40% | 403 | pid:1913 (Lock : transactionid) ->pid:1890 (Lock : transactionid) ->pid:1888
3% | 32 | pid:1913 (Lock : transactionid) ->pid:1890
(3 rows)
postgres@pgu:~$ psql -f pg_ash_wait_chain.sql
% of total wait | seconds | wait_chain
-----------------+---------+------------------------------------------------------------------------------
57% | 582 | pid:1890 (Lock : transactionid) ->pid:1888
40% | 403 | pid:1913 (Lock : transactionid) ->pid:1890 (Lock : transactionid) ->pid:1888
3% | 33 | pid:1913 (Lock : transactionid) ->pid:1890
(3 rows)
As you can see the first two chains are still displayed (as the query does not filter on ash_time) but are not waiting anymore (seconds does not increase) while the last one (new one) is still waiting (seconds increase).
Remarks
- This query and active session history usage is 100% inspired by Tanel Poder oracle script
- The query in this blog post does not filter on ash_time, but it would be easy to adapt to do so (so that you could drill down in a particular time window interval)
- Franck Pachot described other usages of the pg_active_session_history in those posts:
- You can find the script used in this post (as well as others and usage examples) in this github repository
Conclusion
We have seen how the blocking information part of the pg_active_session_history view could help to drill down in case of blocking activity.