Graphs – code.openark.org http://shlomi-noach.github.io/blog/ Blog by Shlomi Noach Thu, 03 Mar 2011 07:41:27 +0000 en-US hourly 1 https://wordpress.org/?v=5.3.3 32412571 Generating Google line charts with SQL, part II https://shlomi-noach.github.io/blog/mysql/generating-google-line-charts-with-sql-part-ii https://shlomi-noach.github.io/blog/mysql/generating-google-line-charts-with-sql-part-ii#comments Thu, 03 Mar 2011 07:38:10 +0000 https://shlomi-noach.github.io/blog/?p=3337 This post continues Generating Google line charts with SQL, part I, in pursue of generating time series based image charts.

We ended last post with the following chart:


http://chart.apis.google.com/chart?cht=lc&chs=400x200&chtt=SQL%20chart&chxt=x,y&chxr=1,-4716.6,5340.0&chd=s:dddddddddeeeeeefffffffffeeeedddcccbbaaZZZYYYXXXXXXXXXYYYZZabbcdeefghhijkkllmmmmmmmmllkkjihgfedcbZYXWVUTSRRQQPPPPQQQRSTUVWXZacdfgijlmnpqrssttuuuttssrqonmkigfdbZXVTSQONMLKJIIIIIIJKLMOPRTVXZbegilnprtvwyz01111110zyxvtrpnkifcaXUSPNLJHFECBBAAABBCEFHJLNQTWZcfilortwy1346789999876420yvspmjfcYVSOL

which has a nice curve, and a proper y-legend, but incorrect x-legend and no ticks nor grids.

To date, Google Image Charts do not support time-series charts. We can’t just throw timestamp values and expect the chart to properly position them. We need to work these by hand.

This is not easily done; if our input consists of evenly spread timestamp values, we are in a reasonable position. If not, what do we do?

There are several solutions to this:

  • We can present whatever points we have on the chart, making sure to position them properly. This makes for an uneven distribution of ticks on the x-axis, and is not pleasant to watch.
  • We can extrapolate values for round hours (or otherwise round timestamp resolutions), and so show evenly spread timestamps. I don’t like this solution one bit, since we’re essentially inventing values here. Extrapolation is nice when you know you have nice curves, but not when you’re doing database monitoring, for example. You must have the precise values.
  • We can do oversampling, then group together several measurements within round timestamp resolutions. For example, we can make a measurement every 2 minutes, yet present only 6 measurements per hour, each averaging up 10 round minutes. This is the approach I take with mycheckpoint.

The latest approach goes even beyond that: what if we missed 30 minutes of sampling? Say the server was down. We then need to “invent” the missing timestamps. Note that we invent the timestamps, we do not invent values. We must present the chart with missing values on our invented timestamps.

I may show how to do this in a future post. Meanwhile, let’s simplify and assume our values are evenly spread.

Sample data

We use google_charts.sql. Note that the timestamp values provided in Part I of this post is skewed, so make sure to use this file.

x-axis values

We use chxl to present with x-axis values. We may be tempted to just list all values. Would that work?

Sadly, no, for two reasons:

  1. Google is not smart enough; whatever we throw at it, it will try to present. So, if we have 288 rows, that’s 288 x-axis values. Not enough room, to be sure! Smarter implementations would automatically hide some values, so as only to present with non-overlapping values.
  2. Our URL will turn out to be too long. Remember: 2048 characters is our maximum limit for GET request!

Also, we must format our timestamp to be of minimal width. In our example, we have a 24 hour range. We therefore present timestamps in hh:MM format. So, a naive approach would be to:

SELECT
  CONCAT(
    'http://chart.apis.google.com/chart?cht=lc&chs=400x200&chtt=SQL%20chart&chxt=x,y&chxr=1,',
    ROUND(min_value, 1), ',',
    ROUND(max_value, 1),
    '&chd=s:',
    GROUP_CONCAT(
      IF(
        data IS NULL,
        '_',
        SUBSTRING(
          'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',
          1+61*(data - min_value)/(max_value - min_value),
          1
        )
      )
      SEPARATOR ''
    ),
   '&chxl=0:|',
    GROUP_CONCAT(
      DATE_FORMAT(ts, '%H:%i')
      SEPARATOR '|'
    )
  ) FROM chart_data, chart_data_minmax

The resulting URL is just too long.

Solution? Let’s only consider round hour timestamps! Our next attempt looks like this (we also throw in chxs, to show ticks):

SELECT
  CONCAT(
    'http://chart.apis.google.com/chart?cht=lc&chs=400x200&chtt=SQL%20chart&chxt=x,y&chxr=1,',
    ROUND(min_value, 1), ',',
    ROUND(max_value, 1),
    '&chd=s:',
    GROUP_CONCAT(
      IF(
        data IS NULL,
        '_',
        SUBSTRING(
          'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',
          1+61*(data - min_value)/(max_value - min_value),
          1
        )
      )
      SEPARATOR ''
    ),
   '&chxs=0,505050,10,0,lt',
   '&chxl=0:|',
    GROUP_CONCAT(
      IF(
        MINUTE(ts) = 0,
        DATE_FORMAT(ts, '%H:%i'),
        NULL
      )
      SEPARATOR '|'
    )
  ) FROM chart_data, chart_data_minmax

and results with:


http://chart.apis.google.com/chart?cht=lc&chs=400x200&chtt=SQL%20chart&chxt=x,y&chxr=1,-4716.6,5340.0&chd=s:dddddddddeeeeeefffffffffeeeedddcccbbaaZZZYYYXXXXXXXXXYYYZZabbcdeefghhijkkllmmmmmmmmllkkjihgfedcbZYXWVUTSRRQQPPPPQQQRSTUVWXZacdfgijlmnpqrssttuuuttssrqonmkigfdbZXVTSQONMLKJIIIIIIJKLMOPRTVXZbegilnprtvwyz01111110zyxvtrpnkifcaXUSPNLJHFECBBAAABBCEFHJLNQTWZcfilortwy1346789999876420yvspmjfcYVSOL&chxs=0,505050,10,0,lt&chxl=0:|00:00|01:00|02:00|03:00|04:00|05:00|06:00|07:00|08:00|09:00|10:00|11:00|12:00|13:00|14:00|15:00|16:00|17:00|18:00|19:00|20:00|21:00|22:00|23:00

Too messy, isn’t it?

A word about ticks

You would think: OK, then, let’s just present every 4 round hours timestamps. But there’s a catch: a tick will show only when there’s an x-axis value. It’s nice to have a tick for every hour, but we only want to present values every 4 hours.

Fortunately, we can provide with an unseen value: a space (URL encoded as ‘+‘). So we complicate things up a bit on the chxl to read:

SELECT
  CONCAT(
    'http://chart.apis.google.com/chart?cht=lc&chs=400x200&chtt=SQL%20chart&chxt=x,y&chxr=1,',
    ROUND(min_value, 1), ',',
    ROUND(max_value, 1),
    '&chd=s:',
    GROUP_CONCAT(
      IF(
        data IS NULL,
        '_',
        SUBSTRING(
          'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',
          1+61*(data - min_value)/(max_value - min_value),
          1
        )
      )
      SEPARATOR ''
    ),
   '&chxs=0,505050,10,0,lt',
   '&chxl=0:|',
    GROUP_CONCAT(
      IF(
        MINUTE(ts) = 0,
        IF(
          HOUR(ts) MOD 4 = 0,
          DATE_FORMAT(ts, '%H:%i'),
          '+'
        ),
        NULL
      )
      SEPARATOR '|'
    )
  ) FROM chart_data, chart_data_minmax

and get:


http://chart.apis.google.com/chart?cht=lc&chs=400x200&chtt=SQL%20chart&chxt=x,y&chxr=1,-4716.6,5340.0&chd=s:dddddddddeeeeeefffffffffeeeedddcccbbaaZZZYYYXXXXXXXXXYYYZZabbcdeefghhijkkllmmmmmmmmllkkjihgfedcbZYXWVUTSRRQQPPPPQQQRSTUVWXZacdfgijlmnpqrssttuuuttssrqonmkigfdbZXVTSQONMLKJIIIIIIJKLMOPRTVXZbegilnprtvwyz01111110zyxvtrpnkifcaXUSPNLJHFECBBAAABBCEFHJLNQTWZcfilortwy1346789999876420yvspmjfcYVSOL&chxs=0,505050,10,0,lt&chxl=0:|00:00|+|+|+|04:00|+|+|+|08:00|+|+|+|12:00|+|+|+|16:00|+|+|+|20:00|+|+|+

OK, I cheated

Who says sample data starts with a round hour? We have that hidden assumption here, since the first tick is necessarily a round hour in our code. Yet our data may start at 12:35, for example. Sorry, you’ll have to dig into mycheckpoint’s source code to see a thorough solution. It’s just too much for this post.

Grids

Let’s wrap this up with grids. Grids work by specifying the step size (in percent of overall height/width) and initial offset (again, in percent).

Wouldn’t it be nicer if grids were automatically attached to ticks? I mean, REALLY! What were those guys thinking? (I know, they’re doing great work. Keep it up!)

Problem is, I have no idea how Google chooses to distribute values on the y-axis. I don’t know where y-axis ticks will be placed. So on y-axis, I just choose to split charts to 4 even parts, and draw horizontal grids between them. Percent is 25 (100/4), offset is 0.

But I do have control over the x-axis. In our case, I know how many ticks we’ll be having. Plus, I made life easier by assuming we start with a round hour, so no offset is required.

Umm… How many ticks do we have? Easy: the number of round hours. This can be calculated by: SUM(MINUTE(ts) = 0. Actually, we need to take 1 off.

We now build the chg parameter:

SELECT
  CONCAT(
    'http://chart.apis.google.com/chart?cht=lc&chs=400x200&chtt=SQL%20chart&chxt=x,y&chxr=1,',
    ROUND(min_value, 1), ',',
    ROUND(max_value, 1),
    '&chd=s:',
    GROUP_CONCAT(
      IF(
        data IS NULL,
        '_',
        SUBSTRING(
          'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',
          1+61*(data - min_value)/(max_value - min_value),
          1
        )
      )
      SEPARATOR ''
    ),
   '&chxs=0,505050,10,0,lt',
   '&chxl=0:|',
    GROUP_CONCAT(
      IF(
        MINUTE(ts) = 0,
        IF(
          HOUR(ts) MOD 4 = 0,
          DATE_FORMAT(ts, '%H:%i'),
          '+'
        ),
        NULL
      )
      SEPARATOR '|'
    ),
   '&chg=', ROUND(100.0/((SUM(MINUTE(ts) = 0) -1)), 2), ',25,1,2,0,0'
  ) FROM chart_data, chart_data_minmax

and get:


http://chart.apis.google.com/chart?cht=lc&chs=400x200&chtt=SQL%20chart&chxt=x,y&chxr=1,-4716.6,5340.0&chd=s:dddddddddeeeeeefffffffffeeeedddcccbbaaZZZYYYXXXXXXXXXYYYZZabbcdeefghhijkkllmmmmmmmmllkkjihgfedcbZYXWVUTSRRQQPPPPQQQRSTUVWXZacdfgijlmnpqrssttuuuttssrqonmkigfdbZXVTSQONMLKJIIIIIIJKLMOPRTVXZbegilnprtvwyz01111110zyxvtrpnkifcaXUSPNLJHFECBBAAABBCEFHJLNQTWZcfilortwy1346789999876420yvspmjfcYVSOL&chxs=0,505050,10,0,lt&chxl=0:|00:00|+|+|+|04:00|+|+|+|08:00|+|+|+|12:00|+|+|+|16:00|+|+|+|20:00|+|+|+&chg=4.35,25,1,2,0,0

Phew!

Conclusion

So we haven’t worked on offsets. And, this is a single line chart. What about multiple lines? Legend? The following chart:


is harder to achieve. I’m leaving this up to you!

]]>
https://shlomi-noach.github.io/blog/mysql/generating-google-line-charts-with-sql-part-ii/feed 2 3337
Generating Google line charts with SQL, part I https://shlomi-noach.github.io/blog/mysql/generating-google-line-charts-with-sql-part-i https://shlomi-noach.github.io/blog/mysql/generating-google-line-charts-with-sql-part-i#comments Tue, 01 Feb 2011 08:29:31 +0000 https://shlomi-noach.github.io/blog/?p=1468 In this series of posts I wish to show how Google Charts can be generated via SQL. We discuss the Google Charts limitations which must be challenged, and work towards a simple chart.

I’m going to present the algorithm I use in mycheckpoint, a MySQL monitoring utility, which generates Google charts by raw data using views. An example of such chart follows:


http://chart.apis.google.com/chart?cht=lc&chs=370x180&chts=303030,12&chtt=Latest+24+hours:+Nov+9,+05:50++-++Nov+10,+05:50&chf=c,s,ffffff&chdl=Rentals+rate:+custom_1_psec&chdlp=b&chco=ff8c00&chd=s:GDGKGFLFGMJHRLMPPNULJRPLTOPRUMYPPVRNbQUSUSbSNWUOfSWTObVSUVWSVYVPbTPjfTbRTdXReUWhcTQRQZbTWYVYPaVZXdYYWPTabYUTbW99QLgLNIOIRNNMIKRJEHGFHGJGGFIFDFGDK&chxt=x,y&chxr=1,0,8.720000&chxl=0:|+||08:00||+||12:00||+||16:00||+||20:00||+||00:00||+||04:00||&chxs=0,505050,10,0,lt&chg=4.17,25,1,2,0.69,0&chxp=0,0.69,4.86,9.03,13.20,17.37,21.54,25.71,29.88,34.05,38.22,42.39,46.56,50.73,54.90,59.07,63.24,67.41,71.58,75.75,79.92,84.09,88.26,92.43,96.60&tsstart=2010-11-09+05:50:00&tsstep=600

mycheckpoint does not actually call on Google to do the chart rendering, but invokes its own JavaScript code to visualize the URL locally.

Here are some downsides for using Google charts:

  • The URL cannot be as long as you like. 2048 characters is an upper bound you’ll want to keep behind. [Google charts POST method calls are available, which leads to 16K equivalent of URL length — this is still not too helpful due to the nature of POST calls]
  • Features are inconsistent. To specify label or tick positions, one must specify exact positions. To specify grid positions, one must supply with step, offset, etc. There are more such inconsistencies.
  • Google charts are not too friendly. Taking the ticks and grids example from above, there really shouldn’t be a reason why grids would not be automatically generated according to ticks definitions. But we are required to specify positions for the ticks as well as for the grids.
  • There is no support for time-series. One must translate time as x-axis values.
  • Perhaps most intimidating to many people: to generate a Google chart, once must send data to Google. Which is the main reason I used local JavaScript rendering.

Anyway, let’s build a very simple chart. Since I will not cover everything in this post, we make for some relaxed conditions.

Sample data

Consider the chart_data table, below:

SELECT * FROM chart_data;
+-----+---------------------+-------------------+
| id  | ts                  | data              |
+-----+---------------------+-------------------+
|   1 | 2011-02-01 09:59:15 |  1.99666833293656 |
|   2 | 2011-02-01 09:59:15 |  7.94677323180245 |
|   3 | 2011-02-01 09:59:15 |  17.7312123996804 |
|   4 | 2011-02-01 09:59:15 |   31.153467384692 |
|   5 | 2011-02-01 09:59:15 |  47.9425538604203 |
|   6 | 2011-02-01 09:59:15 |  67.7570968074042 |
|   7 | 2011-02-01 09:59:15 |  90.1904762132767 |
|   8 | 2011-02-01 09:59:15 |  114.776974543924 |
|   9 | 2011-02-01 09:59:15 |  140.998843732947 |
|  10 | 2011-02-01 09:59:15 |  168.294196961579 |
|  11 | 2011-02-01 09:59:15 |  196.065619213516 |
|  12 | 2011-02-01 09:59:15 |  223.689380632134 |
|  13 | 2011-02-01 09:59:15 |   250.52512820847 |
...

Find complete SQL script in google_charts.sql

In the above table we assume the timestamp values are evenly distributed. This is because of the aforementioned fact that Google Charts do not support time-series. So, if the given timnestamps are not evenly distributed, or maybe some values are skipped, we must manually compensate for that. In mycheckpoint I do that. In this blog post, we skip it.

Axis boundaries

By default, Google Charts expect data to lie within the range 0..100. We need to reset the scale to match our minimum/maximum values, and lay out our values within that range.

We will be using views in this solution, and here is the first view:

CREATE OR REPLACE VIEW
 chart_data_minmax
AS
 SELECT MIN(data) AS min_value, MAX(data) AS max_value
FROM
 chart_data
;

In our example we will just settle with the min/max values. In real worlds we may:

  • Choose 0 to be the lower value, even if the minimal value is positive
  • Choose to round min-max values to a nearest power of 10.

The axis boundaries go under the following parameters:

  • chxt=x,y (we hard code bottom axis, left axis)
  • chxr=1,min,max (with min, max as computed by the above)

The following query produces these params:

SELECT CONCAT('&chxt=x,y&chxr=1,', ROUND(min_value, 1), ',', ROUND(max_value, 1)) AS minmax FROM chart_data_minmax;
+---------------------------------+
| minmax                          |
+---------------------------------+
| &chxt=x,y&chxr=1,-4716.6,5340.0 |
+---------------------------------+

Data values

Next, we have some 244 rows of data. Data is double. Can we be certain this all fits within 2048 characters? What if we had more rows?

Alas, we must make a trade-off. We will sacrifice accuracy in favor of a short URL. We will use simple encoding for our data. This means just one character per value, encoded A-Za-z0-9. This means just 62 distinct values.

So we must translate each data value into the range 0..61, and then into a character. First step is:

SELECT (data - min_value)/(max_value - min_value) AS range_01 FROM chart_data, chart_data_minmax;
+---------------------+
| range_01            |
+---------------------+
|   0.469204617275163 |
|   0.469796281944551 |
|   0.470769223945374 |
|   0.472103901937297 |
|   0.473773369865649 |
|   0.475743682188048 |
|   0.477974405550919 |
...

The above translates the data into the 0..1 range. Next is to simply multiply by 61:

SELECT 61*(data - min_value)/(max_value - min_value) AS range_0_61 FROM chart_data, chart_data_minmax;
+-------------------+
| range_0_61        |
+-------------------+
|  28.6214816537849 |
|  28.6575731986176 |
|  28.7169226606678 |
|  28.7983380181751 |
|  28.9001755618046 |
|  29.0203646134709 |
|  29.1564387386061 |
|  29.3055730348683 |
|  29.4646269983728 |
|  29.6301923928522 |
...

To translate into a simple format character we parse a constant string:

SELECT
  SUBSTRING(
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',
    1+61*(data - min_value)/(max_value - min_value),
    1
  ) AS c FROM chart_data, chart_data_minmax;
+------+
| c    |
+------+
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| e    |
| e    |
...

Oh, and we should handle NULLs, as well, so:

SELECT
  IF(
    data IS NULL,
    '_',
    SUBSTRING(
      'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',
      1+61*(data - min_value)/(max_value - min_value),
      1
    )
  ) AS c FROM chart_data, chart_data_minmax;
+------+
| c    |
+------+
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| e    |
| e    |
...

And, to present it as a char sequence, we use the chd parameter and call GROUP_CONCAT to the rescue:

SELECT
  CONCAT('&chd=s:',
    GROUP_CONCAT(
      IF(
        data IS NULL,
        '_',
        SUBSTRING(
          'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',
          1+61*(data - min_value)/(max_value - min_value),
          1
        )
      )
      SEPARATOR ''
    )
  ) AS c FROM chart_data, chart_data_minmax
\G
*************************** 1. row ***************************
c: &chd=s:dddddddddeeeeeefffffffffeeeedddcccbbaaZZZYYYXXXXXXXXXYYYZZabbcdeefghhijkkllmmmmmmmmllkkjihgfedcbZYXWVUTSRRQQPPPPQQQRSTUVWXZacdfgijlmnpqrssttuuuttssrqonmkigfdbZXVTSQONMLKJIIIIIIJKLMOPRTVXZbegilnprtvwyz01111110zyxvtrpnkifcaXUSPNLJHFECBBAAABBCEFHJLNQTWZcfilortwy1346789999876420yvspmjfcYVSOL

Wrap up

Combining the above two queries, with some URL necessities, we get:

SELECT
  CONCAT(
    'http://chart.apis.google.com/chart?cht=lc&chs=400x200&chtt=SQL%20chart&chxt=x,y&chxr=1,',
    ROUND(min_value, 1), ',',
    ROUND(max_value, 1),
    '&chd=s:',
    GROUP_CONCAT(
      IF(
        data IS NULL,
        '_',
        SUBSTRING(
          'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',
          1+61*(data - min_value)/(max_value - min_value),
          1
        )
      )
      SEPARATOR ''
    )
  ) FROM chart_data, chart_data_minmax
;

And get:


http://chart.apis.google.com/chart?cht=lc&chs=400x200&chtt=SQL%20chart&chxt=x,y&chxr=1,-4716.6,5340.0&chd=s:dddddddddeeeeeefffffffffeeeedddcccbbaaZZZYYYXXXXXXXXXYYYZZabbcdeefghhijkkllmmmmmmmmllkkjihgfedcbZYXWVUTSRRQQPPPPQQQRSTUVWXZacdfgijlmnpqrssttuuuttssrqonmkigfdbZXVTSQONMLKJIIIIIIJKLMOPRTVXZbegilnprtvwyz01111110zyxvtrpnkifcaXUSPNLJHFECBBAAABBCEFHJLNQTWZcfilortwy1346789999876420yvspmjfcYVSOL

The above chart depicts our visualized data. Alas, x-axis ranges are wrong; no ticks nor grids are yet presented.

Granularity is not the best, either: the different 62 values are spread across a 200 pixel high chart, making for a 3 pixel distinction between two values, evident on the left area of the chart.

Continues on Generating Google line charts with SQL, part II.

]]>
https://shlomi-noach.github.io/blog/mysql/generating-google-line-charts-with-sql-part-i/feed 3 1468
mycheckpoint (rev. 170): improved custom queries; local charting; page/swap I/O monitoring; improved HTML reports https://shlomi-noach.github.io/blog/mysql/mycheckpoint-rev-170-improved-custom-queries-local-charting-pageswap-io-monitoring-improved-html-reports https://shlomi-noach.github.io/blog/mysql/mycheckpoint-rev-170-improved-custom-queries-local-charting-pageswap-io-monitoring-improved-html-reports#comments Fri, 16 Jul 2010 08:58:40 +0000 https://shlomi-noach.github.io/blog/?p=2650 Revision 170 of mycheckpoint, a MySQL monitoring solution, has been released. New and updated in this revision:

  • Improved custom queries: lifting of limitations from previous, introductory revision; better HTML presentation
  • Local, inline charting: no rendering of Google Charts, unless explicitly requested. All charts are now rendered locally using JavaScript. No data is now sent over the network.
  • Page/Swap I/O monitoring: now monitoring for page ins and outs, swap ins and outs (Linux only).
  • Improved HTML reports: several improvements on presentation (see sample, more follow).

Improved custom queries

Some limitations, introduced in revision 132, are now lifted. New features are introduced.

  • There is now no limit to the number of custom queries (well, an INT limit).
  • In fact, the data tables adjust themselves to the existing custom queries in the form of auto-deploy: once a new custom query is added or an old one removed, mycheckpoint will add or remove the relevant columns from the data tables.
  • The chart_order column is now utilized: HTML reports which include custom query charts now order those charts according to chart_order values. This makes for nicer reports.
  • The standard HTML brief report (SELECT html FROM sv_report_html_brief) now automatically includes all custom charts. The HTML brief report is the report one usually wants to look at: it provides with the latest 24 hours metrics for selected values. It now becomes a centralized place for all that is interesting in the past 24 hours.
  • Custom queries are now allowed to return NULL, treated as a missing value. This is a bugfix from previous revisions.

Local charting

Motivation for local charting is clear: no one likes having their data being sent over the network. And no one likes Google to know about their DML values.

I’ve been playing around with quite a few charting solutions, and have gone into depths with two of them, adding and rewriting quite a lot of code. Eventually, I settled on my very own rendering. Here’s what I’ve seen & tested:

  • dygraphs: a very nice time series charting library. I’ve presented a use case on a previous post.
    • Pros: slick, easy to work with.
    • Cons: uses HTML Canvas for rendering. This is fine on Firefox, Chrome, Safari, you name it. This isn’t fine on IE, which does not support Canvas. There’s ExplorerCanvas, a hack tool which converts canvas to IE’s VML, but it is far from being satisfactory: it is sloooow. Very, very slow. It is slow with one chart; but loading of 21 charts, as I do in some of mycheckpoint‘s reports can take long minutes on Internet explorer.
    • Cons: Only provides with a time series chart. No scatter plots.
  • Because they’re using ExplorerCanvas for IE, flot, jqPlot etc., are all unacceptable.
  • gRaphael: very slick charts based on Raphael. The original line charts are very basic, and I have invested a lot of time rewriting a great deal (you can find it all here). Raphael uses VML on IE, and SVG for all other browsers.
    • Pros: very slick. Supports various chart types, including line (though not time-series) and scatter.
    • Cons: slooooooooow when instantiating multiple charts. Unbearably slow, both on Firefox and IE. Slow as in minutes of waiting.

In addition, all of the above solutions were quite heavyweight: at about 45KB to start with, then add ExplorerCanvas or jQuery, or Raphael as supporting libraries, these became a real burden.

So, I had some time to spare (business is fine, thank you. I was a bit Ill. I’m feeling well now, thank you), and was upset what with all the time I invested in the above coding. And I decided to invest even more time, and build my own charts.

Enter openark-charts.



Currently, these line charts and scatter charts know how to parse a Google Image chart URL (only some features supported — only those I’m actually using with mycheckpoint). These are not full blown solutions: they come to serve mycheckpoint. And they do so nicely, if I may say so. Using Canvas for most browsers, or VML for IE, these very small pieces of code (10K for line chart, 6K for scatter chart, minified) load fast, use very little memory, and do their work well.

Granted, neither provides with interactive features: this is planned for the future.

Page/swap I/O monitoring

(Linux only) mycheckpoint now reads /proc/vmstat to get the pageins, pageouts, swapins and swapouts (since last reboot). I was actually looking at completely different places on the /proc file system to get swap info, and was frustrated with the complexity involved, till I bumped on /proc/vmstat… New tricks every day!

Improved HTML reports

This is mostly HTML make-up. Some minimal design, some more details thrown into the HTML pages (name of DB, MySQL version, mycheckpoint version). A little more verbosity; all sorts of stuff which was neglected so far.

Here are some show off examples of the new HTML views: [full report], [brief report], [24/7 report], [custom full report], [custom brief report], [alert pending report].

All HTML views now utilize the new openark-charts, and none renders charts with Google charts. This means when you use your HTML view, your data is safe. No data is sent over the net. All charts are rendered using Javascript, which is loaded and executed locally.

But if you like, there’s a [url] link next to each chart, which leads to a (online) Google chart image. Why? Because neither HTML Canvas nor VML allow for a complete rendering of the charts to an image. So this is a way for one to retrieve & store a chart’s image. Don’t use it if you see no reason for it; it’s just there.

And I even threw in rounded corners (IE users: only as of Windows 7).

Future plans

Work is going on. These are the non-scheduled future tasks I see:

  • Monitoring InnoDB Plugin & XtraDB status.
  • Interactive charts. See my earlier post.
  • A proper man page.
  • Anything else that interests me.

Try it out

Try out mycheckpoint. It’s a different kind of monitoring solution. You will need basic SQL skills, and in return you’ll get a lot of power under your hands.

mycheckpoint is released under the New BSD License.

]]>
https://shlomi-noach.github.io/blog/mysql/mycheckpoint-rev-170-improved-custom-queries-local-charting-pageswap-io-monitoring-improved-html-reports/feed 3 2650
mycheckpoint (rev. 132): custom monitoring, custom charts, process list dump https://shlomi-noach.github.io/blog/mysql/mycheckpoint-rev-132-custom-monitoring-custom-charts-process-list-dump https://shlomi-noach.github.io/blog/mysql/mycheckpoint-rev-132-custom-monitoring-custom-charts-process-list-dump#comments Fri, 04 Jun 2010 09:17:27 +0000 https://shlomi-noach.github.io/blog/?p=2463 Revision 132 of mycheckpoint has been released. New and updated in this revision:

  • Custom monitoring: monitoring & charting for user defined queries
  • HTML reports for custom monitoring
  • Process list dump upon alert notifications

Custom monitoring & charts

Custom monitoring allows the user to supply with a query, the results of which will be monitored.

That is, mycheckpoint monitors the status variables, replication status, OS metrics. But it cannot by itself monitor one’s application. Which is why a user may supply with such query as:

SELECT COUNT(*) FROM shopping_cart WHERE is_pending=1

Such a query will tell an online store how many customers are in the midst of shopping. There is no argument that this number is worth monitoring for. Given the above query, mycheckpoint will execute it per sample, and store the query’s result along with all sampled data, to be then aggregated by complex views to answer for:

  • What was the value per given sample?
  • What is the value difference for each sample?
  • What is the change per second, i.e. the rate?

mycheckpoint goes one step forward, and explicity records another metric:

  • How much time did it take to take that sample?

As another example, a query worth testing for rate:

SELECT MAX(shopping_cart_id) FROM shopping_cart

The above will provide with the last id. Assuming this is AUTO_INCREMENT, and assuming we’re on auto_increment_increment=1, two samples will allow us to get the number of created carts between those samples. Now, here’s a metric I’d like to read:

  • How many carts are created per second, for each hour of the day?

We get all these for free with mycheckpoint, which already does this analysis. All we need to provide is the query, and how we would like it to be visualized (visualization is optional, it is not the only way to diagnose monitored data) graphically:

INSERT INTO
 custom_query (custom_query_id, enabled, query_eval, description, chart_type, chart_order)
 VALUES (0, 1, 'SELECT COUNT(*) FROM store.shopping_cart WHERE is_pending=1', 'Number of pending carts', 'value', 0);

INSERT INTO
 custom_query (custom_query_id, enabled, query_eval, description, chart_type, chart_order)
 VALUES (1, 1, 'SELECT MAX(shopping_cart_id) FROM store.shopping_cart', 'Created carts rate', 'value_psec', 0);

We can later query for these values, just like we do for normal monitored values:

mysql> SELECT id, ts, created_tmp_tables_psec, custom_0, custom_1_psec FROM sv_sample WHERE ts >= NOW() - INTERVAL 1 HOUR;
+-------+---------------------+-------------------------+----------+---------------+
| id    | ts                  | created_tmp_tables_psec | custom_0 | custom_1_psec |
+-------+---------------------+-------------------------+----------+---------------+
| 50730 | 2010-05-21 19:05:01 |                   16.64 |      448 |          3.02 |
| 50731 | 2010-05-21 19:10:02 |                   20.97 |       89 |          1.73 |
| 50732 | 2010-05-21 19:15:01 |                   15.70 |      367 |          3.56 |
| 50733 | 2010-05-21 19:20:01 |                   18.32 |       54 |          1.43 |
| 50734 | 2010-05-21 19:25:01 |                   16.42 |       91 |          1.96 |
| 50735 | 2010-05-21 19:30:02 |                   21.93 |      233 |          2.11 |
| 50736 | 2010-05-21 19:35:02 |                   14.58 |      176 |          1.91 |
| 50737 | 2010-05-21 19:40:01 |                   21.61 |      168 |          1.93 |
| 50738 | 2010-05-21 19:45:01 |                   16.05 |      241 |          2.44 |
| 50739 | 2010-05-21 19:50:01 |                   19.70 |       46 |          1.19 |
| 50740 | 2010-05-21 19:55:01 |                   15.85 |      177 |          2.28 |
| 50741 | 2010-05-21 20:00:01 |                   19.04 |        8 |          0.82 |
+-------+---------------------+-------------------------+----------+---------------+

Of course, it is also possible to harness mycheckpoint‘s views power to generate charts:

mysql> SELECT custom_1_psec FROM sv_report_chart_sample\G

The rules are:

  • There can (currently) only be 18 custom queries.
  • The custom_query_id must range 0-17 (to be lifted soon).
  • A custom query must return with exactly one row, with exactly one column, which is a kind of integer.

Please read my earlier post on custom monitoring to get more background.

Custom monitoring HTML reports

Custom monitoring comes with a HTML reports, featuring requested charts. See a sample custom report.

In this sample report, a few queries are monitored for value (pending rentals, pending downloads) and a few for rates (downloads per second, emails per second etc.).

Custom HTML reports come in two flavors:

  • Brief reports, featuring last 24 hours, as in the example above. These are handled by the sv_custom_html_brief view.
  • Full reports, featuring last 24 hours, last 10 days, known history. These take longer to generate, and are handled by the sv_custom_html view.

The sample report was generated by issuing:

SELECT html FROM sv_custom_html_brief;

I won’t go into details here as for how this view generates the HTML code. There is a myriad of view dependencies, with many interesting tricks on the way. But do remember it’s just a view. You don’t need an application (not even mycheckpoint itself) to generate the report. All it takes is a query.

Processlist dump

When an alert notification fires (an email is prepared to inform on some alert condition), a processlist dump summary is taken and included in email report. It may be useful to understand why the slave is lagging, or exactly why there are so many active threads.

The dump summary presents the processlist much as you would see it on SHOW PROCESSLIST, but only lists the active threads, noting down how many sleeping processes there are (PS, thread & process are the same in the terminology of MySQL connections). An example dump looks like this:

PROCESSLIST summary:

     Id: 3
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 3168098
  State: Waiting for master to send event
   Info: NULL
-------

     Id: 4
   User: system user
   Host:
     db: prod_db
Command: Connect
   Time: 612
  State: Updating
   Info: UPDATE user SET is_offline = 1 WHERE id IN (50440010,50440011)
-------

     Id: 8916579
   User: prod_user
   Host: localhost
     db: prod_db
Command: Query
   Time: 1
  State: Sending data
   Info: INSERT IGNORE INTO archive.stat_archive (id, origin, path, ts, content
-------

     Id: 8916629
   User: mycheckpoint
   Host: localhost
     db: NULL
Command: Query
   Time: 0
  State: NULL
   Info: SHOW PROCESSLIST
-------
Sleeping: 3 processes

Future plans

Work is going on. These are the non-scheduled future tasks I see:

  • Monitoring InnoDB Plugin & XtraDB status.
  • Interactive charts. See my earlier post.
  • Monitoring for swap activity (Linux only).
  • Enhanced custom queries handling, including auto-deploy upon change of custom queries.
  • A proper man page.
  • Anything else that interests me.

Try it out

Try out mycheckpoint. It’s a different kind of monitoring solution. You will need basic SQL skills, and in return you’ll get a lot of power under your hands.

mycheckpoint is released under the New BSD License.

]]>
https://shlomi-noach.github.io/blog/mysql/mycheckpoint-rev-132-custom-monitoring-custom-charts-process-list-dump/feed 1 2463
Static charts vs. interactive charts https://shlomi-noach.github.io/blog/mysql/static-charts-vs-interactive-charts https://shlomi-noach.github.io/blog/mysql/static-charts-vs-interactive-charts#comments Tue, 02 Mar 2010 13:28:08 +0000 https://shlomi-noach.github.io/blog/?p=2027 I’m having my usual fun with charts. Working on mycheckpoint, I’ve generated monitoring charts using the Google Chars API. But I’ve also had chance to experiment and deploy interactive charts, JavaScript based. In particular, I used and tweaked dygraphs.

I’d like to note some differences in using charts of both kinds. And I think it makes a very big difference.

Static charts

I’ll call any image-based chart by “static chart”. It’s just a static image. Example of such charts are those generated by Google Image Charts (they now also have new, interactive charts), or RRDtool. Show below is an example of a static chart; in this example, generated by Google:

Pros and cons of static charts

Pros

  • Images can be viewed on any graphical platform. Browsers, email clients, cell phones, whatever.
  • Self contained: chart image, legend, scales: all in one image.
  • As such, easy to move around.
  • Are safe to use.

Cons

  • Images are fuzzy. Is the com_replace_psec really 0? Maybe it’s 0.1? A larger value can make lower values hard to tell.
  • Images are inaccurate: the colors can lie. The red and green lines showing are hard to tell apart. The red is painted above the green. Data gets “lost”.
  • They do not zoom (one needs to regenerate larger image)
  • Unless encoded with base64, HTML pages which include images need to link outside.
  • In the particular case of Google Charts, one is limited to 2K length URL. Trust me, it’s a big limitation! (PS, Google now support POST method to allow for up to 16K. But… it’s a POST method…)
  • In the particular case of Google Charts, one must have an internet connection.
  • In the particular case of Google Charts, one must submit data to Google.

Interactive charts

Interactive charts are those which react to your commands. These are either JavaScript or Flash based, mostly. They allow for really nice features. Take the following chart as an example: try and move over with your mouse; or select sections to zoom in.


DML

[graphDiv]
[labelsDiv]

The above chart is generated with dygraphs. Since it is embedded within my WordPress page, the layout is affected by that of my theme. Take a look at this example page to see similar charts outside this blog site (Internet Explorer users: Maxmimize/minimize button will not work well for now. And, may I suggest Mozilla Firefox?)

Pros and cons of interactive charts

Pros

  • Can present you with exact values. No more doubt about the com_replace_psec values.
  • Can allow for zoom in, zoom out.

Cons

  • Need supporting platform. The above cannot be viewed by non-JavaScript browsers (cell phones, etc.)
  • Browser support is also an issue with JavaScript.
  • Emailing such report will result in mail blocking in many companies: mail filters will not allow for JavaScript code to pass.
  • Charts are not necessarily self-contained, in terms of the chart entity With Flash charts (e.g. Fusion Charts) this works. But in the above, the legend and scales are outside the image. As such, they cannot be just moved around.
  • HTML pages which include such charts can be self contained. The HTML page can include all the JavaScript dependencies, in addition to the chart generating code. Flash based charts cannot be self contained.

Summary

Interactive charts are cool!

I’m now integrating dygraphs into mycheckpoint (How nice it is to work with BSD & MIT licenses!). Though I may later switch to flot, interactive charts will be the next standard charting way in mycheckpoint. I will continue supporting static Google Charts, as follows from the above pros and cons list.

]]>
https://shlomi-noach.github.io/blog/mysql/static-charts-vs-interactive-charts/feed 7 2027
mycheckpoint (rev. 88): mount points monitoring, improved charting, enhanced auto-deploy https://shlomi-noach.github.io/blog/mysql/mycheckpoint-rev-88-mount-points-monitoring-improved-charting-enhanced-auto-deploy https://shlomi-noach.github.io/blog/mysql/mycheckpoint-rev-88-mount-points-monitoring-improved-charting-enhanced-auto-deploy#respond Wed, 10 Feb 2010 06:29:47 +0000 https://shlomi-noach.github.io/blog/?p=1904 Revision #88 of mycheckpoint is released. In this revision:

  • Disk space monitoring
  • Improved charting
  • Enhanced auto-deploy
  • And more…

Disk space monitoring

mycheckpoint now monitors (on Linux only) three mount points:

  1. The “/” (root) mount point
  2. The datadir mount point
  3. The tmpdir mount point

It may well be the case that two of the above (or perhaps all three of them) share the same mount point. For example, if there isn’t any particular partition for “/tmp“, it is possible that the tmpdir (by default “/tmp“) is on the same mount point as “/“. mycheckpoint does not care.

mycheckpoint monitors and reports the mount point’s used percent, in a similar algorithm df uses.

Disk space monitoring is only possible when monitoring the local machine (i.e. mycheckpoint runs on the same machine as the monitored MySQL server). In the future mycheckpoint may also monitor additional mount points, such as the various logs mount points.

Improved charting

There has been some extensive work to turn the charts into real time-series based. Google charts does not support time series charts; when it will, the required URL length would probably be too long to be used. Some SQL tweaks made it possible to display the charts in correct time-scale even if sampling is taken on non constant interval (or fail to be taken for long periods).

For more examples see the link for HTML brief reports sample, below.

I will write more on SQL Google charts generation in the future.

Enhanced auto-deploy

mycheckpoint will now detect changes to the MySQL version, in addition to changes in mycheckpoint‘s version itself. This means there’s no need in ever worrying about upgrades to either one of these components. Just use mycheckpoint to take another sample; it will auto-detect if the MySQL version is different, and start sampling all those new variables introduced in the new version (or stop sampling variables no longer used). It works both for MySQL upgrades and downgrades.

Enhanced localhost detection

To determine whether it is monitoring the local host, mycheckpoint now considers the hostname for the monitored server, and sees if it is either ‘127.0.0.1’, ‘localhost’, or the machine’s hostname or fully qualified hostname.domainname (these last two additions apply for Unix based machines, and have only been tested on Linux so far).

HTML brief reports

Getting a full HTML report is time consuming. I’ve had requests (though not officially submitted through the Issues mechanism) to make it shorter. This is as yet a difficult job. There’s just too much data to aggregate (up to ~180 days of every-5-minute-samples, in a common scenario).

HTML brief reports were introduced in previous versions, and have now been enhanced to include more data. These only present last 24 hours data, and load fast. See HTML brief report sample.

Get it

Downloads are available on Google code’s mycheckpoint page. Documentation can be found on the mycheckpoint home page.

On the press

Not so new by now (it’s two months old), I’m very happy that mycheckpoint has been noted by Jeremy Zawodny in his “My Top Resources of 2009” column on Linux Magazine.

Future plans

Immediate plans for mycheckpoint are:

  • Email alerts notifications; this will allow mycheckpoint to become a real monitoring solution. Following the concept of “SQL oritented monitoring“, these will be SQL based as well.
  • Custom monitoring: allowing user defined queries to be recorded by mycheckpoint; these can then participate in alerts monitoring. This will allow for easy email notifications on program-level errors.
]]>
https://shlomi-noach.github.io/blog/mysql/mycheckpoint-rev-88-mount-points-monitoring-improved-charting-enhanced-auto-deploy/feed 0 1904
mycheckpoint rev. 76: OS monitoring, auto deploy, brief HTML and 24/7 reports https://shlomi-noach.github.io/blog/mysql/mycheckpoint-rev-76-os-monitoring-auto-deploy-brief-html-and-247-reports https://shlomi-noach.github.io/blog/mysql/mycheckpoint-rev-76-os-monitoring-auto-deploy-brief-html-and-247-reports#comments Tue, 05 Jan 2010 08:55:14 +0000 https://shlomi-noach.github.io/blog/?p=1784 Revision 76 of mycheckpoint comes with quite a few improvements, including:

  • OS monitoring (CPU, load average, memory)
  • Auto-deploy
  • Improved charting
  • Brief HTML reports
  • 24/7 charts

OS Monitoring

When monitoring the local machine, mycheckpoint now monitors CPU utilization, load average, memory and swap space.

This only applies to the Linux operating system; there is currently no plan to work this out for other operating systems.

Examples:

mysql> SELECT os_cpu_utilization_percent FROM sv_report_chart_sample;

mycheckpoint-chart-cpu-sample
mysql> SELECT ts, os_loadavg FROM mycheckpoint.sv_report_sample;
+---------------------+------------+
| 2009-12-27 11:45:01 |       1.78 |
| 2009-12-27 11:50:01 |       2.48 |
| 2009-12-27 11:55:01 |       2.35 |
...
+---------------------+------------+
mysql> SELECT report FROM mycheckpoint.sv_report_human_sample ORDER BY id DESC LIMIT 1 \G
*************************** 1. row ***************************
report:
Report period: 2009-12-27 13:20:01 to 2009-12-27 13:25:01. Period is 5 minutes (0.08 hours)
Uptime: 100.0% (Up: 334 days, 06:37:28 hours)

OS:
 Load average: 1.67
 CPU utilization: 25.2%
 Memory: 7486.4MB used out of 7985.6484MB (Active: 6685.8906MB)
 Swap: 3835.2MB used out of 8189.3750MB
...

Auto-deploy

mycheckpoint now has a version recognition mechanism. There is no need to call mycheckpoint with the “deploy” argument on first install or after upgrade. mycheckpoint will recognize a change of version and will auto-deploy before moving on to monitoring your system.

It is still possible, though, to use “deploy“, in case you just want to make sure an upgrade takes place, without issuing a monitoring action.

Improved charting

Further improvements and bug fixes made to the Google charts, including the implementation of missing values charting.

Brief HTML report

In contrast with the full blown HTML report (see sample), which presents hourly/daily/weekly reports for the many metrics, the new brief report only presents with a few hourly based charts. These include InnoDB performance, DML, OS metrics, and replication status.

To get a brief HTML report, issue:

mysql> SELECT html FROM sv_report_html_brief;


See sample brief HTML report.

24/7 charts

24/7 charts present the various metrics on a 24×7 matrix, which allows for diagnostics of usage throughout the day and week. For example, it makes it easier to see how things slow down on Saturday/Sunday; how load increases on 10:00am every day, etc.

24/7 charts are provided by the sv_report_chart_24_7 view.

DESC sv_report_chart_24_7;
+---------------------------------------+----------+------+-----+---------+-------+
| Field                                 | Type     | Null | Key | Default | Extra |
+---------------------------------------+----------+------+-----+---------+-------+
| innodb_read_hit_percent               | longblob | YES  |     | NULL    |       |
| innodb_buffer_pool_reads_psec         | longblob | YES  |     | NULL    |       |
| innodb_buffer_pool_pages_flushed_psec | longblob | YES  |     | NULL    |       |
| innodb_os_log_written_psec            | longblob | YES  |     | NULL    |       |
| innodb_row_lock_waits_psec            | longblob | YES  |     | NULL    |       |
| mega_bytes_sent_psec                  | longblob | YES  |     | NULL    |       |
| mega_bytes_received_psec              | longblob | YES  |     | NULL    |       |
| key_read_hit_percent                  | longblob | YES  |     | NULL    |       |
| key_write_hit_percent                 | longblob | YES  |     | NULL    |       |
| com_select_psec                       | longblob | YES  |     | NULL    |       |
| com_insert_psec                       | longblob | YES  |     | NULL    |       |
| com_delete_psec                       | longblob | YES  |     | NULL    |       |
| com_update_psec                       | longblob | YES  |     | NULL    |       |
| com_replace_psec                      | longblob | YES  |     | NULL    |       |
| com_set_option_percent                | longblob | YES  |     | NULL    |       |
| com_commit_percent                    | longblob | YES  |     | NULL    |       |
| slow_queries_percent                  | longblob | YES  |     | NULL    |       |
| select_scan_psec                      | longblob | YES  |     | NULL    |       |
| select_full_join_psec                 | longblob | YES  |     | NULL    |       |
| select_range_psec                     | longblob | YES  |     | NULL    |       |
| table_locks_waited_psec               | longblob | YES  |     | NULL    |       |
| opened_tables_psec                    | longblob | YES  |     | NULL    |       |
| created_tmp_tables_psec               | longblob | YES  |     | NULL    |       |
| created_tmp_disk_tables_psec          | longblob | YES  |     | NULL    |       |
| connections_psec                      | longblob | YES  |     | NULL    |       |
| aborted_connects_psec                 | longblob | YES  |     | NULL    |       |
| threads_created_psec                  | longblob | YES  |     | NULL    |       |
| seconds_behind_master                 | longblob | YES  |     | NULL    |       |
| os_loadavg                            | longblob | YES  |     | NULL    |       |
| os_cpu_utilization_percent            | longblob | YES  |     | NULL    |       |
| os_mem_used_mb                        | longblob | YES  |     | NULL    |       |
| os_mem_active_mb                      | longblob | YES  |     | NULL    |       |
| os_swap_used_mb                       | longblob | YES  |     | NULL    |       |
+---------------------------------------+----------+------+-----+---------+-------

Example:

mysql> SELECT com_select_psec, innodb_buffer_pool_pages_flushed_psec FROM mycheckpoint.sv_report_chart_24_7 \G
mycheckpoint-chart-247-sample

Trying mycheckpoint

Future plans

I haven’t got any major immediate issues; planning on user customization of charts and HTML reports. Considering thresholds and alerting for the future.

]]>
https://shlomi-noach.github.io/blog/mysql/mycheckpoint-rev-76-os-monitoring-auto-deploy-brief-html-and-247-reports/feed 2 1784
New and noteworthy in mycheckpoint (rev. 57) https://shlomi-noach.github.io/blog/mysql/new-and-noteworthy-in-mycheckpoint-rev-57 https://shlomi-noach.github.io/blog/mysql/new-and-noteworthy-in-mycheckpoint-rev-57#comments Wed, 16 Dec 2009 19:12:07 +0000 https://shlomi-noach.github.io/blog/?p=1695 Rev. 57 of mycheckpoint has been released and is available for download.

New and updated in this revision:

Remote host monitoring

It is now possible to monitor one host, while writing into another. Either or both could be remote hosts:

mycheckpoint --host=localhost --monitored-host=192.168.10.178

The above monitors the MySQL server on 192.168.10.178, and writes down to localhost (to be queried later)

mycheckpoint --monitored-host=127.0.0.1 --host=192.168.10.178

The above monitors the MySQL server on 127.0.0.1, and writes down to 192.168.10.178.

As result of the above addition, binary logging for monitoring statements are now enabled by default. The previous solution, in which binary logging were disabled by default and the same schema was utilized by different servers was far from being a clean solution. [Read more on Remote & multiple hosts monitoring]

Improved charting

The google charts reports have been improved significantly. Here’s how:

BeforeAfter
Before on left (or above), After on right (or below)

And

BeforeAfter
Before on left (or above), After on right (or below)

Recap: the charts are not stored anywhere; they are being calculated and generated by SQL queries & views, based on the raw data. For example, to generate the above, I do:

SELECT innodb_estimated_log_mb_written_per_hour FROM sv_report_chart_sample\G
*************************** 1. row ***************************
innodb_estimated_log_mb_written_per_hour: http://chart.apis.google.com/chart?cht=lc&chs=400x200&chts=303030,12&chtt=Latest+24+hours:+Dec+8,+08:20++-++Dec+9,+08:20&chdl=innodb_estimated_log_mb_written_per_hour&chdlp=b&chco=ff8c00&chd=s:OQOTSPPQPSROUQMTRSQTPSURQTQQPSOSRTRRUTRRTSRRRSRSTRRQSURSUVRRXQTVRUTTVUTOYTXSYTTSYUSUTWUPRRPQSQRQTPPSPQMPPQOQMQPMOMOLQORNNPNRNNOPQOOMQQPNRNMNORQMSTQQPTPRUQVQTTVSURUVVVSSSVWQVUSTVSWSSUURWRVRTWdotpQNROPQRPQQMPLRO9PWNRPNNNPXUMLNNPQQPSONPLLNWXZTQROSQMOONQPLJNOOQQKMPSMMPLfPSSSRTUQSORSSRSPNRSQSQ&chxt=x,y&chxr=1,0,2708.0&chxl=0:|09:00||||13:00||||17:00||||21:00||||01:00||||05:00||||&chxs=0,505050,10&chg=4.17,25,1,3,2.78,0&chxp=0,2.78,6.95,11.12,15.29,19.46,23.63,27.80,31.97,36.14,40.31,44.48,48.65,52.82,56.99,61.16,65.33,69.50,73.67,77.84,82.01,86.18,90.35,94.52,98.69

If you thought I was going over the edge in Auto scaling, scaled SQL graphs concluded or SQL pie chart, allow me to assure you: generation of the above charts using SQL & views is SQL blasphemy. While cool, the Google Charts API does not do too much on its own, and expects the caller to do the math for putting in the grids, scales and labels (and though all of them are tightly related, they must all be explicitly specified). I’ll write a later post on this.

Flexible charting

A new table, called charts_api, is introduced. It contains one row, which lists charting configuration.

SELECT * FROM charts_api;
+-------------+--------------+----------------------------------------------------------------+------------------------------------+
| chart_width | chart_height | simple_encoding                                                | service_url                        |
+-------------+--------------+----------------------------------------------------------------+------------------------------------+
|         400 |          200 | ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789 | http://chart.apis.google.com/chart |
+-------------+--------------+----------------------------------------------------------------+------------------------------------+

To change generated charts dimensions, simply update chart_width or chart_height.

To use other chart APIs, update service_url. I’ve tried out Eastwood Charts, a Java-based service, which implements the Google Charts API. There’s still some stuff missing, such as the positioning of labels and grids; and it’s slightly less polished (but configurable). But if you’re concerned about sending requests to Google, it’s a viable alternative.

So, drop the WAR file into your Tomcat / JBoss / other j2ee server, then follow http://127.0.0.1/eastwood/chart?

Fix to questions vs. queries issues

mycheckpoint now correctly identifies whether to use questions or queries when analyzing DML queries. If you got some 570% for com_update out of total queries and wondered about it, it should now be resolved to normal values.

On upgrading your current installation

  1. Download & install via debian or python installers
  2. Issue mycheckpoint once with “deploy” as argument
  3. Done

Future plans

I’m working on 24×7 charting; minimal operating system monitoring; auto-detection and auto-deploy on version upgrade; and more.

Stay tuned!

]]>
https://shlomi-noach.github.io/blog/mysql/new-and-noteworthy-in-mycheckpoint-rev-57/feed 1 1695
SQL multi line chart https://shlomi-noach.github.io/blog/mysql/sql-multi-line-chart https://shlomi-noach.github.io/blog/mysql/sql-multi-line-chart#comments Tue, 03 Nov 2009 10:44:13 +0000 https://shlomi-noach.github.io/blog/?p=1400 Time for another charting SQL query. I wish to present a single-query generated multi-line/area chart. I’ll walk through some of the steps towards making this happen. By the end of this post I’ll present some real-data charts, area charts and colored charts.

+---------+-----------------------------------------------------------------------------------+
| y_scale | chart                                                                             |
+---------+-----------------------------------------------------------------------------------+
| 1       | ****---------#######----------------------------------------*******--------###### |
| 0.88    | ----***---###-------##-----------------------------------***-------***---##------ |
| 0.75    | -------**#------------##-------------------------------**-------------*##-------- |
| 0.63    | ------##-*--------------##----------------------------*--------------##**-------- |
| 0.5     | -@@@@@@@@@@@@@@@----------#-----@@@@@@@@@@@@@@@@----**---------@@@@@@@@@@@@@@@@-- |
| 0.38    | ----#-------*--------------##----------------------*---------------#-------*----- |
| 0.25    | --##---------*---------------#--------------------*--------------##---------*---- |
| 0.13    | -#------------**--------------#-----------------**--------------#------------*--- |
| 0       | @---------------*--------------##--------------*---------------#--------------**- |
| -0.12   | -----------------*---------------#------------*--------------##-----------------* |
| -0.25   | ------------------*---------------#---------**--------------#-------------------- |
| -0.37   | -------------------**--------------#-------*---------------#--------------------- |
| -0.5    | ----------------@@@@@@@@@@@@@@@@----##----*-----@@@@@@@@@@@@@@@----------------@@ |
| -0.62   | ----------------------**--------------#-**--------------#------------------------ |
| -0.75   | ------------------------**------------**#-------------##------------------------- |
| -0.87   | --------------------------**-------***---###-------###--------------------------- |
| -1      | ----------------------------*******---------#######------------------------------ |
|         | v:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::v |
|         | 0                                                                               8 |
|         |     # sin(x)                                                                      |
|         |     * cos(x)                                                                      |
|         |     @ sign(tan(x))/2                                                              |
+---------+-----------------------------------------------------------------------------------+

Requirements

We need a generic query, which returns at least these two columns: ordering_column and row_values, The latter being a comma-delimited list of values. For example, the following query will do:

SELECT
 value/10 AS ordering_column,
 CONCAT(SIN(value/10), ',', COS(value/10), ',', SIGN(TAN(value/10))/2) AS row_values,
 @multi_line_chart_values_legend := 'sin(x),cos(x),sign(tan(x))/2' AS legend
FROM tinyint_asc
LIMIT 81;
+-----------------+---------------------------------------------+------------------------------+
| ordering_column | row_values                                  | legend                       |
+-----------------+---------------------------------------------+------------------------------+
|          0.0000 | 0,1,0.0000                                  | sin(x),cos(x),sign(tan(x))/2 |
|          0.1000 | 0.099833416646828,0.99500416527803,0.5000   | sin(x),cos(x),sign(tan(x))/2 |
|          0.2000 | 0.19866933079506,0.98006657784124,0.5000    | sin(x),cos(x),sign(tan(x))/2 |
|          0.3000 | 0.29552020666134,0.95533648912561,0.5000    | sin(x),cos(x),sign(tan(x))/2 |
|          0.4000 | 0.38941834230865,0.92106099400289,0.5000    | sin(x),cos(x),sign(tan(x))/2 |
|          0.5000 | 0.4794255386042,0.87758256189037,0.5000     | sin(x),cos(x),sign(tan(x))/2 |
|          0.6000 | 0.56464247339504,0.82533561490968,0.5000    | sin(x),cos(x),sign(tan(x))/2 |
...

Don’t worry about the @multi_line_chart_values_legend variable; it will be used later on for presenting chart legend.

Since I’ve already presented with how to create line charts in Auto scaling, scaled SQL graphs concluded, we will concentrate on how to present a single chart column, which includes multiple values.

Step 1

Let’s say we wish to present the values 12,5,16. We wish to eventually generate something like:

+------------------+
| unwalked_bar     |
+------------------+
| ----Y------X---Z |
+------------------+

Where X is in position 12, Y in position 5, and Z in position 16.

We begin with having these values concatenated into one string:

SET @values := '12,5,16';
SET @num_values := CHAR_LENGTH(@values)-CHAR_LENGTH(REPLACE(@values,',',''))+1;
SELECT @num_values;
+-------------+
| @num_values |
+-------------+
| 3           |
+-------------+

The query above parses the string and correctly identified there are three tokens.

Step 2

We now move on to tokenizing the text:

SELECT
 *,
 SUBSTRING_INDEX(SUBSTRING_INDEX(@row_values, ',', tinyint_asc.value), ',', -1) AS row_value
FROM
 tinyint_asc,
 (SELECT @row_values := '12,5,16' AS row_values) AS sel_values,
 (SELECT @num_values := CHAR_LENGTH(@row_values)-CHAR_LENGTH(REPLACE(@row_values,',',''))+1 AS num_values) AS sel_num_values
WHERE
 tinyint_asc.value BETWEEN 1 AND @num_values
;
+-------+------------+------------+-----------+
| value | row_values | num_values | row_value |
+-------+------------+------------+-----------+
|     1 | 12,5,16    |          3 | 12        |
|     2 | 12,5,16    |          3 | 5         |
|     3 | 12,5,16    |          3 | 16        |
+-------+------------+------------+-----------+

Using a numbers table (tinyint_asc), we walk the string and tokenize it. We also maintain an indicator per value.

Step 3

We need to recognize the minimum and maximum values:

SELECT
  value AS row_value_indicator,
  row_value,
  @min_row_value := LEAST(IFNULL(@min_row_value, row_value), row_value) AS min_row_value,
  @max_row_value := GREATEST(IFNULL(@max_row_value, row_value), row_value) AS max_row_value
FROM (
  SELECT
  *,
  SUBSTRING_INDEX(SUBSTRING_INDEX(@row_values, ',', tinyint_asc.value), ',', -1)+0 AS row_value
    FROM
      tinyint_asc,
      (SELECT @row_values := '12,5,16' AS row_values) AS sel_values,
      (SELECT @num_values := CHAR_LENGTH(@row_values)-CHAR_LENGTH(REPLACE(@row_values,',',''))+1 AS num_values) AS sel_num_values,
      (SELECT @min_row_value := NULL) AS sel_min_row_value,
      (SELECT @max_row_value := NULL) AS sel_max_row_value
    WHERE
      tinyint_asc.value BETWEEN 1 AND @num_values
  ) sel_row_values
;
+---------------------+-----------+---------------+---------------+
| row_value_indicator | row_value | min_row_value | max_row_value |
+---------------------+-----------+---------------+---------------+
|                   1 |        12 |            12 |            12 |
|                   2 |         5 |             5 |            12 |
|                   3 |        16 |             5 |            16 |
+---------------------+-----------+---------------+---------------+

The last line presents the correct values: 5 and 16 being min, max values respectively.

Step 4

We now iterate from min value to max value, this being the graph range, and, per row value (token in row_values column), we indicate whether there’s a pixel on the graph.

SELECT
  * ,
  IF(tinyint_asc.value = row_value, row_value_indicator, '-') AS display_val
FROM
  tinyint_asc,
  (SELECT
    value AS row_value_indicator,
    row_value,
    @min_row_value := LEAST(IFNULL(@min_row_value, row_value), row_value) AS min_row_value,
    @max_row_value := GREATEST(IFNULL(@max_row_value, row_value), row_value) AS max_row_value
    FROM (
    SELECT
      *,
      SUBSTRING_INDEX(SUBSTRING_INDEX(@row_values, ',', tinyint_asc.value), ',', -1)+0 AS row_value
    FROM
      tinyint_asc,
      (SELECT @row_values := '12,5,16' AS row_values) AS sel_values,
      (SELECT @num_values := CHAR_LENGTH(@row_values)-CHAR_LENGTH(REPLACE(@row_values,',',''))+1 AS num_values) AS sel_num_values,
      (SELECT @min_row_value := NULL) AS sel_min_row_value,
      (SELECT @max_row_value := NULL) AS sel_max_row_value
    WHERE
      tinyint_asc.value BETWEEN 1 AND @num_values
    ) sel_row_values
  ) AS sel_row_values_indicators
WHERE
  tinyint_asc.value BETWEEN 1 AND @max_row_value
;
+-------+---------------------+-----------+---------------+---------------+-------------+
| value | row_value_indicator | row_value | min_row_value | max_row_value | display_val |
+-------+---------------------+-----------+---------------+---------------+-------------+
|     1 |                   1 |        12 |            12 |            12 | -           |
|     1 |                   2 |         5 |             5 |            12 | -           |
|     1 |                   3 |        16 |             5 |            16 | -           |
|     2 |                   1 |        12 |            12 |            12 | -           |
|     2 |                   2 |         5 |             5 |            12 | -           |
|     2 |                   3 |        16 |             5 |            16 | -           |
|     3 |                   1 |        12 |            12 |            12 | -           |
|     3 |                   2 |         5 |             5 |            12 | -           |
|     3 |                   3 |        16 |             5 |            16 | -           |
|     4 |                   1 |        12 |            12 |            12 | -           |
|     4 |                   2 |         5 |             5 |            12 | -           |
|     4 |                   3 |        16 |             5 |            16 | -           |
|     5 |                   1 |        12 |            12 |            12 | -           |
|     5 |                   2 |         5 |             5 |            12 | 2           |
|     5 |                   3 |        16 |             5 |            16 | -           |
|     6 |                   1 |        12 |            12 |            12 | -           |
|     6 |                   2 |         5 |             5 |            12 | -           |
|     6 |                   3 |        16 |             5 |            16 | -           |

This table can get very long.

Step 5

We now group the marks per graph-row value. If there’s no mark, we present with an empty space. If only one mark is present, we display that mark. If two lines collide on that point, we need to decide which mark to use. In the following example, we choose by order of appearance in the row_values column. Another way (used later on) is to choose the higher value.

SELECT
  string_position,
  LEFT(GROUP_CONCAT(bar_string_token ORDER BY bar_string_token DESC separator ''), 1) AS chosen_mark
FROM
  (SELECT
    value AS string_position,
    IF(tinyint_asc.value = row_value, row_value_indicator, ' ') AS bar_string_token
  FROM
    tinyint_asc,
    (SELECT
      value AS row_value_indicator,
      row_value,
      @min_row_value := LEAST(IFNULL(@min_row_value, row_value), row_value) AS min_row_value,
      @max_row_value := GREATEST(IFNULL(@max_row_value, row_value), row_value) AS max_row_value
      FROM (
      SELECT
        *,
        SUBSTRING_INDEX(SUBSTRING_INDEX(@row_values, ',', tinyint_asc.value), ',', -1)+0 AS row_value
      FROM
        tinyint_asc,
        (SELECT @row_values := '12,5,16' AS row_values) AS sel_values,
        (SELECT @num_values := CHAR_LENGTH(@row_values)-CHAR_LENGTH(REPLACE(@row_values,',',''))+1 AS num_values) AS sel_num_values,
        (SELECT @min_row_value := NULL) AS sel_min_row_value,
        (SELECT @max_row_value := NULL) AS sel_max_row_value
      WHERE
        tinyint_asc.value BETWEEN 1 AND @num_values
      ) sel_row_values
    ) AS sel_row_values_indicators
  WHERE
    tinyint_asc.value BETWEEN 1 AND @max_row_value
  ) AS sel_marked_row_values
GROUP BY
  string_position
;
+-----------------+-------------+
| string_position | chosen_mark |
+-----------------+-------------+
|               1 |             |
|               2 |             |
|               3 |             |
|               4 |             |
|               5 | 2           |
|               6 |             |
|               7 |             |
|               8 |             |
|               9 |             |
|              10 |             |
|              11 |             |
|              12 | 1           |
|              13 |             |
|              14 |             |
|              15 |             |
|              16 | 3           |
+-----------------+-------------+

Step 6

It’s now just a matter of using GROUP_CONCAT to turn this into the following horizontal bar:

+------------------+
| unwalked_bar     |
+------------------+
| ----2------1---3 |
+------------------+

(I know I’m mixing spaces and dashes, it’s just for display purposes).

Putting it all together

From this point on, we use the techniques shown on Auto scaling, scaled SQL graphs concluded to generate the complete chart. For the curious, here’s what the complete query looks like. The text in bold is the only thing that needs to change; replace this with your own query.

SELECT
  y_scale,
  horizontal_bar as chart
  FROM
  (
  SELECT
    @multi_line_chart_row_number := @multi_line_chart_row_number+1,
    CASE @multi_line_chart_row_number
      WHEN 1 THEN ROUND(@multi_line_chart_max_value, @multi_line_chart_value_precision)
      WHEN @multi_line_chart_graph_rows THEN ROUND(@multi_line_chart_min_value, @multi_line_chart_value_precision)
      ELSE ROUND(@multi_line_chart_max_value-(@multi_line_chart_max_value-@multi_line_chart_min_value)*(@multi_line_chart_row_number-1)/(@multi_line_chart_graph_rows-1), @multi_line_chart_value_precision)
    END AS y_scale,
    horizontal_bar,
    @multi_line_chart_bar_length := IFNULL(@multi_line_chart_bar_length, CHAR_LENGTH(horizontal_bar))
  FROM
    (SELECT @multi_line_chart_row_number := 0) AS select_row
    INNER JOIN
    (
    SELECT
      GROUP_CONCAT(SUBSTRING(unwalked_bar, tinyint_asc.value, 1) ORDER BY ordering_column SEPARATOR '') AS horizontal_bar
    FROM
      tinyint_asc
    INNER JOIN (
      SELECT
        ordering_column,
        GROUP_CONCAT(bar_string_token ORDER BY string_position SEPARATOR '') AS unwalked_bar
      FROM
        (SELECT
          ordering_column,
          string_position,
          scaled_string_position,
          REPLACE(LEFT(GROUP_CONCAT(bar_string_token ORDER BY bar_string_token DESC SEPARATOR ''), 1), ' ', '-') AS bar_string_token
        FROM
          (SELECT
            ordering_column,
            @multi_line_chart_scaled_string_position := CONVERT((row_value-@multi_line_chart_min_value)*(@multi_line_chart_graph_rows-1)/(@multi_line_chart_max_value-@multi_line_chart_min_value), UNSIGNED) AS scaled_string_position,
            value AS string_position,
            IF(tinyint_asc.value = @multi_line_chart_scaled_string_position+1, SUBSTRING(IF(@multi_line_chart_values_legend IS NULL, @multi_line_chart_graph_fallback_colors, @multi_line_chart_graph_colors), row_value_indicator, 1), ' ') AS bar_string_token
          FROM
            tinyint_asc,
            (SELECT
              ordering_column,
              value AS row_value_indicator,
              row_value
            FROM (
              SELECT
                *,
                @multi_line_chart_min_value := LEAST(IFNULL(@multi_line_chart_min_value, row_value), row_value) AS min_value,
                @multi_line_chart_max_value := GREATEST(IFNULL(@multi_line_chart_max_value, row_value), row_value) AS max_value,
                @multi_line_chart_min_range := LEAST(IFNULL(@multi_line_chart_min_range, ordering_column), ordering_column) AS min_range,
                @multi_line_chart_max_range := GREATEST(IFNULL(@multi_line_chart_max_range, ordering_column), ordering_column) AS max_range
              FROM
                (SELECT
                  *,
                  SUBSTRING_INDEX(SUBSTRING_INDEX(row_values, ',', tinyint_asc.value), ',', -1)+0 AS row_value,
                  @multi_line_chart_num_values := CHAR_LENGTH(row_values)-CHAR_LENGTH(REPLACE(row_values,',',''))+1 AS num_values
                FROM
                  (SELECT @multi_line_chart_values_legend := NULL) AS select_nullify_values_legend,
                  (SELECT @multi_line_chart_bar_length := NULL) AS select_nullify_bar_length,
                  tinyint_asc,
                  (
                    SELECT
                      value/10 AS ordering_column,
                      CONCAT(SIN(value/10), ',', COS(value/10), ',', SIGN(TAN(value/10))/2) AS row_values,
                      @multi_line_chart_values_legend := 'sin(x),cos(x),sign(tan(x))/2'
                    FROM tinyint_asc LIMIT 81
                  ) AS sel_main_values,
                  (SELECT @multi_line_chart_min_value := NULL) AS select_min,
                  (SELECT @multi_line_chart_max_value := NULL) AS select_max,
                  (SELECT @multi_line_chart_min_range := NULL) AS select_min_range,
                  (SELECT @multi_line_chart_max_range := NULL) AS select_max_range,
                  (SELECT @multi_line_chart_graph_colors := '#*@%o+x;m:') AS select_graph_colors,
                  (SELECT @multi_line_chart_graph_fallback_colors := 'abcdefghij') AS select_graph_fallback_colors,
                  (SELECT @multi_line_chart_value_precision := 2) AS select_value_precision,
                  (SELECT @multi_line_chart_graph_rows := 17) AS select_graph_rows
                ) sel_counted_values_main_values
              WHERE
                value BETWEEN 1 AND @multi_line_chart_num_values
              ) sel_row_values
            ) AS sel_row_values_indicators
          WHERE
            tinyint_asc.value BETWEEN 1 AND @multi_line_chart_graph_rows
          ) AS sel_marked_row_values
        GROUP BY
          ordering_column, string_position
        ) AS sel_walked_bar
      GROUP BY
        ordering_column
    ) AS select_vertical
    WHERE
      tinyint_asc.value BETWEEN 1 AND CHAR_LENGTH(unwalked_bar)
    GROUP BY
      tinyint_asc.value
    ORDER BY
      tinyint_asc.value DESC
    ) AS select_horizontal
  ) AS select_horizontal_untitled
UNION ALL
SELECT '', CONCAT('v', REPEAT(':', @multi_line_chart_bar_length-2), 'v')
UNION ALL
SELECT '', CONCAT(@multi_line_chart_min_range, REPEAT(' ', @multi_line_chart_bar_length-CHAR_LENGTH(@multi_line_chart_min_range)-CHAR_LENGTH(@multi_line_chart_max_range)), @multi_line_chart_max_range)
UNION ALL
SELECT
  '', CONCAT('    ', SUBSTRING(@multi_line_chart_graph_colors, value, 1), ' ', SUBSTRING_INDEX(SUBSTRING_INDEX(@multi_line_chart_values_legend, ',', value), ',', -1))
FROM
  tinyint_asc
WHERE
  value BETWEEN 1 AND @multi_line_chart_num_values
  AND @multi_line_chart_values_legend IS NOT NULL
;

Area charts

Sinus functions curves go nicely with ASCII art. But your everyday chart won’t have such nice curves. There is some erratic behavior to, say, your number of queries per hour. It doesn’t display well in the above graph.

Which is why we can choose to present area graphs: not only display the line, but also the area beneath it.

This leads to the problem of hiding: higher values can hide the lower values display. So we need to adjust the above logic and make sure we always bring to front the lower values.

The following example shows the relation between the percentage of slow queries (per hour) and the percentage of full table scans (of total queries, per hour). It’s really fun to see the unmistakeable connection!

+---------+---------------------------------------------------------------+
| y_scale | chart                                                         |
+---------+---------------------------------------------------------------+
| 4       | --------------------------------*--*------------------------- |
| 3       | -----------------------------****-*****------------------*--- |
| 3       | -----------------------------**********-----------------**-*- |
| 3       | ----------*------------------**********--------------*****-** |
| 3       | ------*--**-----------------************-*-----------******** |
| 3       | ------*******---------------****************--------********* |
| 3       | ------*******---------------******************------********* |
| 3       | ------*********-------------******************------********* |
| 3       | -----************--------*-*******************------********* |
| 2       | *----*****************--************************-**-********* |
| 2       | *----******************************************************** |
| 2       | *----******************************************************** |
| 2       | **--********************************************************* |
| 2       | ************************************************************* |
| 2       | ************************************************************* |
| 2       | ************************************************************* |
| 2       | *****************************#******************************* |
| 1       | *****************************##########*****************##*#* |
| 1       | ******#***#*****************###########**************######## |
| 1       | ******#######***************###############*********######### |
| 1       | #*****########**************##################******######### |
| 1       | #****##########*#*******##*###################******######### |
| 1       | #***######################################################### |
| 1       | #***######################################################### |
| 1       | ############################################################# |
|         | v:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::v |
|         | 2009-10-22 20:00:00                       2009-10-25 08:00:00 |
|         |     # slow queries percent                                    |
|         |     * select scan percent                                     |
+---------+---------------------------------------------------------------

Next we look at the number of SELECT, INSERT, UPDATE, DELETE queries per sec – on a per hour basis.

+---------+---------------------------------------------------------------+
| y_scale | chart                                                         |
+---------+---------------------------------------------------------------+
| 135     | --##--------------------------------------------------------- |
| 130     | -###--------------------------------------------------------- |
| 125     | -###----------------------##--------------------------------- |
| 120     | -###-------------------#####--------------------------------- |
| 115     | -###-----------------#######--------------------------------- |
| 110     | -###----------------########----------------------##--------- |
| 104     | -###--------------##########--------------------####--------- |
| 99      | ####*--------#--############------------------######--------- |
| 94      | ####*--------#-#############---------------#########--------- |
| 89      | #####-----##################------------############--------# |
| 84      | ############################-----------#############--------# |
| 79      | ####%########################---####################---###### |
| 74      | ####%######################################################## |
| 69      | ####%*####################################################### |
| 64      | ####%*####################################################### |
| 59      | ##*#%*###############*####################################### |
| 54      | #**#%*#####*#*****###*****##*###########*#####*#####*#####*## |
| 49      | #***%%*#*********************##########****#*********#**##*** |
| 44      | #***%%*************************##**************************** |
| 39      | ****%%******************************************************* |
| 33      | ****%%**********************************************%******** |
| 28      | *%%*%%**********%******%%*%*%*******%***************%******** |
| 23      | %%%%o%**%**%%o%%%%%%%%%%%%%%o******%%*%%%%%%%%%%%%%%%%*****%% |
| 18      | %oooooooo%%oooooooooooooooooo%%oo%oooooooooooooooooooo%o%%ooo |
| 13      | ooooooooo%%oooooooooooooooooo%%%%%%ooooooooooooooooooo%o%%ooo |
|         | v:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::v |
|         | 2009-10-22 20:00:00                       2009-10-25 08:00:00 |
|         |     # com_select per second                                   |
|         |     * com_insert per second                                   |
|         |     o com_update per second                                   |
|         |     % com_delete per second                                   |
+---------+---------------------------------------------------------------+

Here’s the code to do that.

Colored charts

Time to add colors! The previous charts can be easily improved by sending the terminal escape characters to instruct initiating color display (thanks to TheVoo for providing the idea and implementation for that on SQL pie chart).

I recommend using the following pager if you’re on unix-like OS:

pager awk '{sub(/..$/, "")} {sub(/[ ]+$/, "")} !/^[+][-]/ && NR > 3 { print }'

The following chart is similar to the above, but now utilizes colors. It presents the number of popular DML queries per second, on an hourly basis.

Colored multi area chart

Again, comparing slow queries with full table scans:

Slow queries and full scans overlay
Slow queries and full scans overlay

Comparing number of created temporary tables with number of created disk temporary tables:

temp tables vs disk temp tables
temp tables vs disk temp tables

And, once again, the number of popular DML statements, zoomed in.

color_multi_line_chart_dml

Here’s the code for colored charts.

Some of the techinuqes used in my charting series of blogs can be used, pretty much in the same way, in order to generate Google charts. But other techniques can be used, as well.

More on this in future posts.

]]>
https://shlomi-noach.github.io/blog/mysql/sql-multi-line-chart/feed 4 1400
SQL pie chart https://shlomi-noach.github.io/blog/mysql/sql-pie-chart https://shlomi-noach.github.io/blog/mysql/sql-pie-chart#comments Wed, 12 Aug 2009 10:49:41 +0000 https://shlomi-noach.github.io/blog/?p=1093 My other half says I’m losing it. But I think that as an enthusiast kernel developer she doesn’t have the right to criticize people. (“I like user space better!” – she exclaims upon reading this).

Shown below is a (single query) SQL-generated pie chart. I will walk through the steps towards making this happen, and conclude with what, I hope you’ll agree, are real-world, useful usage samples.

+----------------------------------------------------------------------+
| pie_chart                                                            |
+----------------------------------------------------------------------+
|                                                                      |
|                         ;;;;;;;;;;;;;;;;;;;;;                        |
|                  oooooooo;;;;;;;;;;;;;;;;;;;;;;;;;;;                 |
|             oooooooooooooo;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;            |
|          ooooooooooooooooo                 ;;;;;;;;;;;;#####         |
|        oooooooooooooo                           ;#############       |
|       oooooooooooo                                 ############      |
|      oooooooooooo                                   ############     |
|      ooooooooooo                                     ###########     |
|      oooooooooooo                                   ::::::::::::     |
|       oooooooooooo                                 ::::::::::::      |
|        ooooooooo:::::                           ::::::::::::::       |
|          o::::::::::::::::                 :::::::::::::::::         |
|             :::::::::::::::::::::::::::::::::::::::::::::            |
|                  :::::::::::::::::::::::::::::::::::                 |
|                         :::::::::::::::::::::                        |
|                                                                      |
| ##  red: 1 (10%)                                                     |
| ;;  blue: 2 (20%)                                                    |
| oo  orange: 3 (30%)                                                  |
| ::  white: 4 (40%)                                                   |
+----------------------------------------------------------------------+

Requirements

We need a generic query, which returns at least these two columns: name_column and value_column. For example, the following query will do:

SELECT name AS name_column, value AS value_column FROM sample_values2;
+-------------+--------------+
| name_column | value_column |
+-------------+--------------+
| red         |            1 |
| blue        |            2 |
| orange      |            3 |
| white       |            4 |
+-------------+--------------+

Find sample data in pie_data.sql.

Part 1: expanding the original query

We’re going to need to take the above query’s results and expand them: how much is the ratio from total, per value? As first step, accumulate values:

SELECT
  name_column,
  value_column,
  @accumulating_value := @accumulating_value+value_column AS accumulating_value
FROM (
  SELECT name AS name_column, value AS value_column FROM sample_values2
  ) select_values,
  (SELECT @accumulating_value := 0) select_accumulating_value
;
+-------------+--------------+--------------------+
| name_column | value_column | accumulating_value |
+-------------+--------------+--------------------+
| red         |            1 |                  1 |
| blue        |            2 |                  3 |
| orange      |            3 |                  6 |
| white       |            4 |                 10 |
+-------------+--------------+--------------------+

Next, we calculate ratio of accumulating value, and present it both in [0..1] range, as well as in [0..2*PI] (radians):

SELECT
  name_order,
  name_column,
  value_column,
  accumulating_value,
  accumulating_value/@accumulating_value AS accumulating_value_ratio,
  2*PI()*accumulating_value/@accumulating_value AS accumulating_value_radians
FROM (
  SELECT
    name_column,
    value_column,
    @name_order := @name_order+1 AS name_order,
    @accumulating_value := @accumulating_value+value_column AS accumulating_value,
    @aggregated_name_column := CONCAT(@aggregated_name_column, name_column, ',') AS aggregated_name_column
  FROM (
    SELECT name AS name_column, value AS value_column FROM sample_values2
    ) select_values,
    (SELECT @name_order := 0) select_name_order,
    (SELECT @accumulating_value := 0) select_accumulating_value,
    (SELECT @aggregated_name_column := '') select_aggregated_name_column
  ) select_accumulating_values
;
+------------+-------------+--------------+--------------------+--------------------------+----------------------------+
| name_order | name_column | value_column | accumulating_value | accumulating_value_ratio | accumulating_value_radians |
+------------+-------------+--------------+--------------------+--------------------------+----------------------------+
|          1 | red         |            1 |                  1 |                      0.1 |           0.62831853071796 |
|          2 | blue        |            2 |                  3 |                      0.3 |            1.8849555921539 |
|          3 | orange      |            3 |                  6 |                      0.6 |            3.7699111843078 |
|          4 | white       |            4 |                 10 |                        1 |            6.2831853071796 |
+------------+-------------+--------------+--------------------+--------------------------+----------------------------+

The radians value will help us decide where in the pie chart lies each value.

Part 2: behind the scenes of the pie chart

We now explain how the pie chart works. Later on we combine with Part 1, to produce the complete chart.

We first generate a coordinates system (see SQL graphics):

SELECT
  GROUP_CONCAT(CONCAT(t2.value,'.',t1.value) order by t1.value separator ' ') as circle
FROM
  tinyint_asc t1,
  tinyint_asc t2,
  (select @size := 10) sel_size,
  (select @radius := (@size/2 - 1)) sel_radius
WHERE
  t1.value < @size
  AND t2.value < @size
GROUP BY t2.value
;
+-----------------------------------------+
| circle                                  |
+-----------------------------------------+
| 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 |
| 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 |
| 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 |
| 3.0 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 |
| 4.0 4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 |
| 5.0 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 |
| 6.0 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 |
| 7.0 7.1 7.2 7.3 7.4 7.5 7.6 7.7 7.8 7.9 |
| 8.0 8.1 8.2 8.3 8.4 8.5 8.6 8.7 8.8 8.9 |
| 9.0 9.1 9.2 9.3 9.4 9.5 9.6 9.7 9.8 9.9 |
+-----------------------------------------+

Taking a slightly big step further, we calculate the angle per coordinate, in relation to center of coordinate system. Calculation is in radians, but presented in degrees, since it’s more readable. Also, we note in which quarter of the graph each point lies.

SELECT
  group_concat(
    round(radians*180/PI())
    order by col_number separator ' ') as circle
FROM (
  SELECT
    t1.value AS col_number,
    t2.value AS row_number,
    @dx := (t1.value - (@size-1)/2) AS dx,
    @dy := ((@size-1)/2 - t2.value) AS dy,
    @abs_radians := IF(@dx = 0, PI()/2, (atan(abs(@dy/@dx)))) AS abs_radians,
    CASE
      WHEN SIGN(@dy) >= 0 AND SIGN(@dx) >= 0 THEN @abs_radians
      WHEN SIGN(@dy) >= 0 AND SIGN(@dx) <= 0 THEN PI()-@abs_radians
      WHEN SIGN(@dy) <= 0 AND SIGN(@dx) <= 0 THEN PI()+@abs_radians
      WHEN SIGN(@dy) <= 0 AND SIGN(@dx) >= 0 THEN 2*PI()-@abs_radians
    END AS radians
  FROM
    tinyint_asc t1,
    tinyint_asc t2,
    (select @size := 15) sel_size,
    (select @radius := (@size/2 - 1)) sel_radius
  WHERE
    t1.value < @size
    AND t2.value < @size) select_combinations
  GROUP BY row_number
;
+-------------------------------------------------------------+
| circle                                                      |
+-------------------------------------------------------------+
| 135 131 126 120 113 106 98 90 82 74 67 60 54 49 45          |
| 139 135 130 124 117 108 99 90 81 72 63 56 50 45 41          |
| 144 140 135 129 121 112 101 90 79 68 59 51 45 40 36         |
| 150 146 141 135 127 117 104 90 76 63 53 45 39 34 30         |
| 157 153 149 143 135 124 108 90 72 56 45 37 31 27 23         |
| 164 162 158 153 146 135 117 90 63 45 34 27 22 18 16         |
| 172 171 169 166 162 153 135 90 45 27 18 14 11 9 8           |
| 180 180 180 180 180 180 180 90 0 0 0 0 0 0 0                |
| 188 189 191 194 198 207 225 270 315 333 342 346 349 351 352 |
| 196 198 202 207 214 225 243 270 297 315 326 333 338 342 344 |
| 203 207 211 217 225 236 252 270 288 304 315 323 329 333 337 |
| 210 214 219 225 233 243 256 270 284 297 307 315 321 326 330 |
| 216 220 225 231 239 248 259 270 281 292 301 309 315 320 324 |
| 221 225 230 236 243 252 261 270 279 288 297 304 310 315 319 |
| 225 229 234 240 247 254 262 270 278 286 293 300 306 311 315 |
+-------------------------------------------------------------+

The above needs some formattign to present well, but that’s not the purpose; I’m only showing the above to explain the steps taken.

Part 3: combining the two

Next step is probably the most significant one: we’re going to present a rough, square, weird looking pie chart using the original values:

SELECT
  group_concat(
    (SELECT name_order FROM
      (
      SELECT
        name_order,
        name_column,
        value_column,
        accumulating_value,
        accumulating_value/@accumulating_value AS accumulating_value_ratio,
        2*PI()*accumulating_value/@accumulating_value AS accumulating_value_radians
      FROM (
        SELECT
          name_column,
          value_column,
          @name_order := @name_order+1 AS name_order,
          @accumulating_value := @accumulating_value+value_column AS accumulating_value,
          @aggregated_name_column := CONCAT(@aggregated_name_column, name_column, ',') AS aggregated_name_column
        FROM (
          SELECT name AS name_column, value AS value_column FROM sample_values2
          ) select_values,
          (SELECT @name_order := 0) select_name_order,
          (SELECT @accumulating_value := 0) select_accumulating_value,
          (SELECT @aggregated_name_column := '') select_aggregated_name_column
        ) select_accumulating_values
      ) select_for_radians
    WHERE accumulating_value_radians >= radians LIMIT 1
    )
    order by col_number separator ' ') as circle
FROM (
  SELECT
    t1.value AS col_number,
    t2.value AS row_number,
    @dx := (t1.value - (@size-1)/2) AS dx,
    @dy := ((@size-1)/2 - t2.value) AS dy,
    @abs_radians := IF(@dx = 0, PI()/2, (atan(abs(@dy/@dx)))) AS abs_radians,
    CASE
      WHEN SIGN(@dy) >= 0 AND SIGN(@dx) >= 0 THEN @abs_radians
      WHEN SIGN(@dy) >= 0 AND SIGN(@dx) <= 0 THEN PI()-@abs_radians
      WHEN SIGN(@dy) <= 0 AND SIGN(@dx) <= 0 THEN PI()+@abs_radians
      WHEN SIGN(@dy) <= 0 AND SIGN(@dx) >= 0 THEN 2*PI()-@abs_radians
    END AS radians
  FROM
    tinyint_asc t1,
    tinyint_asc t2,
    (select @size := 21) sel_size,
    (select @radius := (@size/2 - 1)) sel_radius
  WHERE
    t1.value < @size
    AND t2.value < @size) select_combinations
  GROUP BY row_number
;
+-------------------------------------------+
| circle                                    |
+-------------------------------------------+
| 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 |
| 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2 2 2 |
| 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2 2 2 |
| 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2 2 1 |
| 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 1 1 |
| 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 1 1 1 1 |
| 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 1 1 1 1 1 |
| 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 1 1 1 1 1 1 |
| 3 3 3 3 3 3 3 3 3 3 2 2 2 1 1 1 1 1 1 1 1 |
| 3 3 3 3 3 3 3 3 3 3 2 2 1 1 1 1 1 1 1 1 1 |
| 3 3 3 3 3 3 3 3 3 3 2 1 1 1 1 1 1 1 1 1 1 |
| 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 |
| 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 |
| 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 |
| 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 |
| 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 |
| 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 |
| 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 |
| 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 |
| 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 |
| 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 |
+-------------------------------------------+

The interesting SQL trick is that everything goes within the GROUP_CONCAT clause. Instead of presenting the coordinate, we check on the enhanced values table, looking for the first row which has a greater angle (in radians) than the current pixel has. We then display 1, 2, etc. to denote the value.

The next step is actually very simple: instead of drawing the full square, limit to a circle!

SELECT
  group_concat(
    IF(round(sqrt(pow(col_number-(@size-1)/2, 2) + pow(row_number-(@size-1)/2, 2))) BETWEEN @radius/2 AND @radius,
    (SELECT name_order FROM
      (
      SELECT
        name_order,
        name_column,
        value_column,
        accumulating_value,
        accumulating_value/@accumulating_value AS accumulating_value_ratio,
        2*PI()*accumulating_value/@accumulating_value AS accumulating_value_radians
      FROM (
        SELECT
          name_column,
          value_column,
          @name_order := @name_order+1 AS name_order,
          @accumulating_value := @accumulating_value+value_column AS accumulating_value,
          @aggregated_name_column := CONCAT(@aggregated_name_column, name_column, ',') AS aggregated_name_column
        FROM (
          SELECT name AS name_column, value AS value_column FROM sample_values2
          ) select_values,
          (SELECT @name_order := 0) select_name_order,
          (SELECT @accumulating_value := 0) select_accumulating_value,
          (SELECT @aggregated_name_column := '') select_aggregated_name_column
        ) select_accumulating_values
      ) select_for_radians
    WHERE accumulating_value_radians >= radians LIMIT 1
    ), '-')
    order by col_number separator ' ') as circle
FROM (
  SELECT
    t1.value AS col_number,
    t2.value AS row_number,
    @dx := (t1.value - (@size-1)/2) AS dx,
    @dy := ((@size-1)/2 - t2.value) AS dy,
    @abs_radians := IF(@dx = 0, PI()/2, (atan(abs(@dy/@dx)))) AS abs_radians,
    CASE
      WHEN SIGN(@dy) >= 0 AND SIGN(@dx) >= 0 THEN @abs_radians
      WHEN SIGN(@dy) >= 0 AND SIGN(@dx) <= 0 THEN PI()-@abs_radians
      WHEN SIGN(@dy) <= 0 AND SIGN(@dx) <= 0 THEN PI()+@abs_radians
      WHEN SIGN(@dy) <= 0 AND SIGN(@dx) >= 0 THEN 2*PI()-@abs_radians
    END AS radians
  FROM
    tinyint_asc t1,
    tinyint_asc t2,
    (select @size := 21) sel_size,
    (select @radius := 7) sel_radius
  WHERE
    t1.value < @size
    AND t2.value < @size) select_combinations
  GROUP BY row_number
;
+-------------------------------------------+
| circle                                    |
+-------------------------------------------+
| - - - - - - - - - - - - - - - - - - - - - |
| - - - - - - - - - - - - - - - - - - - - - |
| - - - - - - - - - - - - - - - - - - - - - |
| - - - - - - - - 2 2 2 2 2 - - - - - - - - |
| - - - - - - 3 3 3 2 2 2 2 2 2 - - - - - - |
| - - - - - 3 3 3 3 2 2 2 2 2 2 2 - - - - - |
| - - - - 3 3 3 3 3 2 2 2 2 2 2 2 1 - - - - |
| - - - - 3 3 3 3 3 - - - 2 2 2 1 1 - - - - |
| - - - 3 3 3 3 3 - - - - - 1 1 1 1 1 - - - |
| - - - 3 3 3 3 - - - - - - - 1 1 1 1 - - - |
| - - - 3 3 3 3 - - - - - - - 1 1 1 1 - - - |
| - - - 3 3 3 3 - - - - - - - 4 4 4 4 - - - |
| - - - 3 3 3 3 3 - - - - - 4 4 4 4 4 - - - |
| - - - - 3 3 4 4 4 - - - 4 4 4 4 4 - - - - |
| - - - - 3 4 4 4 4 4 4 4 4 4 4 4 4 - - - - |
| - - - - - 4 4 4 4 4 4 4 4 4 4 4 - - - - - |
| - - - - - - 4 4 4 4 4 4 4 4 4 - - - - - - |
| - - - - - - - - 4 4 4 4 4 - - - - - - - - |
| - - - - - - - - - - - - - - - - - - - - - |
| - - - - - - - - - - - - - - - - - - - - - |
| - - - - - - - - - - - - - - - - - - - - - |
+-------------------------------------------+

That looks a lot more like a pie chart.

Part 4: doing the fancy work

We will now add (in one big step):

  • Stretching along the X-axis.
  • Condensing the spaces.
  • Coloring for the pie parts.
  • A legend.

The text in bold is the original query, and is the only thing you need to change in order to create your own pie charts.

SELECT
  group_concat(
    IF(round(sqrt(pow(col_number/@stretch-0.5-(@size-1)/2, 2) + pow(row_number-(@size-1)/2, 2))) BETWEEN @radius*2/3 AND @radius,
    (SELECT SUBSTRING(@colors, name_order, 1) FROM
      (
      SELECT
        name_order,
        name_column,
        value_column,
        accumulating_value,
        accumulating_value/@accumulating_value AS accumulating_value_ratio,
        @aggregated_data := CONCAT(@aggregated_data, name_column, ': ', value_column, ' (', ROUND(100*value_column/@accumulating_value), '%)', '|') AS aggregated_name_column,
        2*PI()*accumulating_value/@accumulating_value AS accumulating_value_radians
      FROM (
        SELECT
          name_column,
          value_column,
          @name_order := @name_order+1 AS name_order,
          @accumulating_value := @accumulating_value+value_column AS accumulating_value
        FROM (
          SELECT name AS name_column, value AS value_column FROM sample_values2 LIMIT 4
          ) select_values,
          (SELECT @name_order := 0) select_name_order,
          (SELECT @accumulating_value := 0) select_accumulating_value,
          (SELECT @aggregated_data := '') select_aggregated_name_column
        ) select_accumulating_values
      ) select_for_radians
    WHERE accumulating_value_radians >= radians LIMIT 1
    ), ' ')
    order by col_number separator '') as pie
FROM (
  SELECT
    t1.value AS col_number,
    t2.value AS row_number,
    @dx := (t1.value/@stretch - (@size-1)/2) AS dx,
    @dy := ((@size-1)/2 - t2.value) AS dy,
    @abs_radians := IF(@dx = 0, PI()/2, (atan(abs(@dy/@dx)))) AS abs_radians,
    CASE
      WHEN SIGN(@dy) >= 0 AND SIGN(@dx) >= 0 THEN @abs_radians
      WHEN SIGN(@dy) >= 0 AND SIGN(@dx) <= 0 THEN PI()-@abs_radians
      WHEN SIGN(@dy) <= 0 AND SIGN(@dx) <= 0 THEN PI()+@abs_radians
      WHEN SIGN(@dy) <= 0 AND SIGN(@dx) >= 0 THEN 2*PI()-@abs_radians
    END AS radians
  FROM
    tinyint_asc t1,
    tinyint_asc t2,
    (select @size := 23) sel_size,
    (select @radius := (@size/2 - 1)) sel_radius,
    (select @stretch := 4) sel_stretch,
    (select @colors := '#;o:X"@+-=123456789abcdef') sel_colors
  WHERE
    t1.value < @size*@stretch
    AND t2.value < @size) select_combinations
  GROUP BY row_number
UNION ALL
  SELECT
    CONCAT(
      REPEAT(SUBSTRING(@colors, value, 1), 2),
      '  ',
      SUBSTRING_INDEX(SUBSTRING_INDEX(@aggregated_data, '|', value), '|', -1)
    )
  FROM
    tinyint_asc
  WHERE
    value BETWEEN 1 AND @name_order
;
+----------------------------------------------------------------------------------------------+
| pie                                                                                          |
+----------------------------------------------------------------------------------------------+
|                                                                                              |
|                                   ;;;;;;;;;;;;;;;;;;;;;;;;;                                  |
|                          oooooooo;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;                         |
|                    ooooooooooooooo;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;                   |
|                oooooooooooooooooooo;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;               |
|             oooooooooooooooooooooooo                     ;;;;;;;;;;;;;;;;;;;;;###            |
|           oooooooooooooooooooo                                 ;;;;;;;;;###########          |
|         oooooooooooooooooo                                         ##################        |
|       ooooooooooooooooo                                               #################      |
|      ooooooooooooooooo                                                 #################     |
|      oooooooooooooooo                                                   ################     |
|     oooooooooooooooo                                                     ################    |
|      oooooooooooooooo                                                   ::::::::::::::::     |
|      ooooooooooooooooo                                                 :::::::::::::::::     |
|       ooooooooooooooooo                                               :::::::::::::::::      |
|         oooooooooooooo::::                                         ::::::::::::::::::        |
|           ooooooo:::::::::::::                                 ::::::::::::::::::::          |
|             ::::::::::::::::::::::::                     ::::::::::::::::::::::::            |
|                :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::               |
|                    :::::::::::::::::::::::::::::::::::::::::::::::::::::::                   |
|                          :::::::::::::::::::::::::::::::::::::::::::                         |
|                                   :::::::::::::::::::::::::                                  |
|                                                                                              |
| ##  red: 1 (10%)                                                                             |
| ;;  blue: 2 (20%)                                                                            |
| oo  orange: 3 (30%)                                                                          |
| ::  white: 4 (40%)                                                                           |
+----------------------------------------------------------------------------------------------+

Making the legend is by itself an interesting hack: aggregating an unknown number of rows using a session user variable, then splitting it again. Of course, this works well if we only have a small number of rows (values), as we expect in our case.

Showcase

OK. Why? Why?

Here are three charts I hope will convince the skeptic reader:

Given the sakila database, list the 8 largest tables (size in KB):

+------------------------------------------------------------------------------+
| pie_chart                                                                    |
+------------------------------------------------------------------------------+
|                                                                              |
|                            #######################                           |
|                    #######################################                   |
|               #################################################              |
|            ##################                   ##################           |
|         ###############                               ###############        |
|        ;;###########                                     #############       |
|      ;;;;;;;;;;;;;                                         #############     |
|      ;;;;;;;;;;;;                                           ############     |
|     ;;;;;;;;;;;;;                                           #############    |
|      ;;;;;;;;;;;;                                           @@@@@@@@@@@@     |
|      ;;;;;;;;;;;;;                                         """""""""""""     |
|        ;;;;;;;;;;;;;                                     XX"""""""""""       |
|         ;;;;;;;;;;;;;;;                               XXXXXXXXXXXXX""        |
|            ;;;;;;;;;;;;;;;;;;                   ::::::::::XXXXXXXX           |
|               ;;;;;;;;;;;;;;;;;;;;;;;;;;oooooooooo::::::::::::X              |
|                    ;;;;;;;;;;;;;;;;;;;;;;ooooooooooo::::::                   |
|                            ;;;;;;;;;;;;;;ooooooooo                           |
|                                                                              |
| ##  rental: 2850816 (43%)                                                    |
| ;;  payment: 2228224 (34%)                                                   |
| oo  inventory: 376832 (6%)                                                   |
| ::  film_text: 325440 (5%)                                                   |
| XX  film: 278528 (4%)                                                        |
| ""  film_actor: 278528 (4%)                                                  |
| @@  customer: 131072 (2%)                                                    |
| ++  staff: 98304 (1%)                                                        |
+------------------------------------------------------------------------------+

How much disk space does each storage engine consume (sum table size per engine)?

+------------------------------------------------------------------------------+
| pie_chart                                                                    |
+------------------------------------------------------------------------------+
|                                                                              |
|                            #######################                           |
|                    #######################################                   |
|               #################################################              |
|            ##################                   ##################           |
|         ###############                               ###############        |
|        #############                                     #############       |
|      #############                                         #############     |
|      ############                                           ############     |
|     #############                                           #############    |
|      ############                                           oooooooooooo     |
|      #############                                         ;;;;;;;;;;;oo     |
|        #############                                     ;;;;;;;;;;;;;       |
|         ###############                               ;;;;;;;;;;;;;;;        |
|            ##################                   #;;;;;;;;;;;;;;;;;           |
|               #####################################;;;;;;;;;;;;              |
|                    ###################################;;;;                   |
|                            #######################                           |
|                                                                              |
| ##  InnoDB: 1908732 (84%)                                                    |
| ;;  MyISAM: 284074 (12%)                                                     |
| oo  ARCHIVE: 84276 (4%)                                                      |
+------------------------------------------------------------------------------+

What were the most popular DMLs during the last 10 seconds?

+------------------------------------------------------------------------------+
| pie_chart                                                                    |
+------------------------------------------------------------------------------+
|                                                                              |
|                            #######################                           |
|                    #######################################                   |
|               #################################################              |
|            ##################                   ##################           |
|         ###############                               ###############        |
|        #############                                     #############       |
|      #############                                         #############     |
|      ############                                           ############     |
|     #############                                           #############    |
|      ############                                           oooo::::::::     |
|      #############                                         ooooooooooooo     |
|        #############                                     ooooooooooooo       |
|         ###############                               ooooooooooooooo        |
|            #################;                   ;;;;;;;;;ooooooooo           |
|               #############;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;oo              |
|                    ######;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;                   |
|                            ;;;;;;;;;;;;;;;;;;;;;;;                           |
|                                                                              |
| ##  com_select: 1876 (69%)                                                   |
| ;;  com_insert: 514 (19%)                                                    |
| oo  com_delete: 277 (10%)                                                    |
| ::  com_update: 63 (2%)                                                      |
+------------------------------------------------------------------------------+

Conclusion

ASCII graphics always look funny; some would say the same about top, wgetcal, etc. (should I even mention lynx?)

I think it is possible to do most common charting with SQL: I’ve already shown how to do horizontal graphs and pie charts. Multi-column bar charts can also be worked out. These are not meant as a permanent solution; but it’s good to be able to visualize some values without having to install Nagios (along with apache, php, drivers, etc.), or otherwise exporting table, copying to desktop machines, loading into OpenOffice impress, generating graphs.

Sometimes you just need an immediate overlook. This is where I find SQL charting to be useful.

Sure, there are Perl and Python solutions for that; that’s easily achieved as well. But doing it from with the MySQL client gives, in my opinion, a level of confidence: you’ll always be able to produce the graph; perl-DBD-MySQL or no perl-DBD-MySQL; Linux or Windows.

Besides, it was fun doing it.

]]>
https://shlomi-noach.github.io/blog/mysql/sql-pie-chart/feed 80 1093