Global Processlist in Percona Monitoring and ManagementOne piece of feedback I often hear from users of Percona Monitoring and Management  (PMM) is that while the Query Analytics feature is great and provides a lot of insights into queries the server handled, it can’t help us to see which queries are running now.

Problem Statement

Real-time access to queries that are running right now can be extremely helpful in case of pileups if the optimizer gets crazy, a bad query id is deployed, or some unexpected locking situation takes place. The usual result is that many queries of the same kind pile up… and if you’re not lucky, they may not complete for many minutes or even hours, all this time invisible in PMM.

Proposed Solution

In addition to Query History, what Query Analytics really provides now is access to “Live Queries”.  This basically gathers currently-running queries from all the nodes that a user currently observes (could be one node or could be a hundred), where queries can be grouped and sliced in a way similar to how Query Analytics works.

For example, for a given a QueryId (Query Pattern), we can see how many instances of such a query are running right now, what the maximum and average execution time is so far, what database hosts and what databases it is active for, what client IPs and users this query are coming from, etc.

Some other Query Analytics features such as EXPLAIN for a query, information about involved tables, etc., also remain relevant for running queries too.

Also, working with current events and not just history means we can do more than just observe them. I could imagine killing some particular query instance or even all queries which match a particular pattern, which would be handy too.

What do you think? Would having such a “Global Processlist” feature in Percona Monitoring and Management be helpful for you?   Anything else we should consider? Let me know in the comments!

11 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
tanuj

it will be extremely helpful to have a processlist view for any of the database choosen and have the option to kill the query /pid

peterzaitsev

Thanks!

rautamiekka

Sounds like very useful !

Daniel

Adding a “quick analysis” or “live analysis” section to PMM can be a huge WIN. Here’s an example: Currently while using PMM to monitor Postgres RDS everything related to prometheus (actually Victoriametrics) works fine in terms of data freshness. However, QAN get lagged due the amount of nodes being monitored.

The workaround was to write a python script that get the last 1MB of the pg log, run pgBadger to analyze it and then use that for “hot debug”

What I will like to see on PMM? Something like that integrated, that doesn’t necessary go through the regular QAN code path but that feed the ClickHouse db so one can use the “Query Analyzer” dashboard. A kind of a merge.

For MySQL it can be the same with a pt-query-digest integrated, etc etc.

rautamiekka

+1

Jean-François Gagné

This looks useful. In addition to that, the possibility to remotely run ps-top on a server. The use-case: a MySQL node is not healthy, in addition to knowing what is running now, I wan to see which table are queried (both in number of ops per second and latency as shown by ps-top) and which files are read from and written to (as shown by ps-top). The other view of ps-top are probably useful, but these 3 views would already be a good start.

Aakash

Sounds good. Also,it will be much helpful if we have options to list / ignore sleep or queries from particular host or particular pattern of queries along with kill and explain a connection as well.

Francisco Miguel Biete Banon

How different would this be to use performance_schema as the source of QAN data?
We have custom dashboards that show the current processlist of the server and allows to kill sessions, but we are missing the option to explain an active statement and see its history in the same screen.

Daniel

You already can use peformance_schema as the source of QAN. The problem is that for example in Postgres there’s no P_S and the monitoring extension that Percona made is not available on RDS so one is stuck

Thomas

Absolutely yes. When I check an unknown system, I always run “select * from information_schema.processlist where time > 1 and command != ‘Sleep’;” multiple times per minute to get an overall picture.

Vinay Pandey

Must-have feature in PMM, also it would be great if we could send the current processes list to the number of email ids.