All of the above might seem to be obvious, but I wanted to frame the rest of this post with a logical starting point. Next let’s see what kinds of things about queries might indicate that they aren’t optimal.
A responsible approach to a task such as finding bad queries begins with a definition of badness that we can be pretty sure is complete. I am interested in knowing whether I’m missing anything, but I believe the definition of badness in this article is fairly complete. (Post comments if I’m wrong.)
Here is my definition:
- Queries are bad either singly (an individual query is bad) or in aggregate (a group or class of queries is bad). I think this is a complete definition because the only other kind of number besides “one” and “many” is “zero”, and a nonexistent query can’t be bad.
- Queries, or groups of queries, are bad because
- they are slow and provide a bad user experience, or
- they add too much load to the system, or
- they block other queries from running
- It finds queries (or groups of queries) that are slow and provide a bad user experience by the parameters to the –outliers option. If a query is slower than X more than N times, it’s bad. You can look at factors such as the variance-to-mean ratio, the Apdex score, and the explain sparkline to see whether the query is likely to be possible to optimize.
Now, given the definition above, how do we find these queries in some input such as a log of queries? It shouldn’t surprise you that there is a working implementation of most of this in a tool, Maatkit’s mk-query-digest. Here’s how:
Read more: MySQL Performance Blog