Generating Google line charts with SQL, part I

February 1, 2011

In this series of posts I wish to show how Google Charts can be generated via SQL. We discuss the Google Charts limitations which must be challenged, and work towards a simple chart.

I'm going to present the algorithm I use in mycheckpoint, a MySQL monitoring utility, which generates Google charts by raw data using views. An example of such chart follows:



http://chart.apis.google.com/chart?cht=lc&chs=370x180&chts=303030,12&chtt=Latest+24+hours:+Nov+9,+05:50++-++Nov+10,+05:50&chf=c,s,ffffff&chdl=Rentals+rate:+custom_1_psec&chdlp=b&chco=ff8c00&chd=s:GDGKGFLFGMJHRLMPPNULJRPLTOPRUMYPPVRNbQUSUSbSNWUOfSWTObVSUVWSVYVPbTPjfTbRTdXReUWhcTQRQZbTWYVYPaVZXdYYWPTabYUTbW99QLgLNIOIRNNMIKRJEHGFHGJGGFIFDFGDK&chxt=x,y&chxr=1,0,8.720000&chxl=0:|+||08:00||+||12:00||+||16:00||+||20:00||+||00:00||+||04:00||&chxs=0,505050,10,0,lt&chg=4.17,25,1,2,0.69,0&chxp=0,0.69,4.86,9.03,13.20,17.37,21.54,25.71,29.88,34.05,38.22,42.39,46.56,50.73,54.90,59.07,63.24,67.41,71.58,75.75,79.92,84.09,88.26,92.43,96.60&tsstart=2010-11-09+05:50:00&tsstep=600

mycheckpoint does not actually call on Google to do the chart rendering, but invokes its own JavaScript code to visualize the URL locally.

Here are some downsides for using Google charts:

  • The URL cannot be as long as you like. 2048 characters is an upper bound you'll want to keep behind. [Google charts POST method calls are available, which leads to 16K equivalent of URL length -- this is still not too helpful due to the nature of POST calls]
  • Features are inconsistent. To specify label or tick positions, one must specify exact positions. To specify grid positions, one must supply with step, offset, etc. There are more such inconsistencies.
  • Google charts are not too friendly. Taking the ticks and grids example from above, there really shouldn't be a reason why grids would not be automatically generated according to ticks definitions. But we are required to specify positions for the ticks as well as for the grids.
  • There is no support for time-series. One must translate time as x-axis values.
  • Perhaps most intimidating to many people: to generate a Google chart, once must send data to Google. Which is the main reason I used local JavaScript rendering.

Anyway, let's build a very simple chart. Since I will not cover everything in this post, we make for some relaxed conditions.

Sample data

Consider the chart_data table, below:

SELECT * FROM chart_data;
+-----+---------------------+-------------------+
| id  | ts                  | data              |
+-----+---------------------+-------------------+
|   1 | 2011-02-01 09:59:15 |  1.99666833293656 |
|   2 | 2011-02-01 09:59:15 |  7.94677323180245 |
|   3 | 2011-02-01 09:59:15 |  17.7312123996804 |
|   4 | 2011-02-01 09:59:15 |   31.153467384692 |
|   5 | 2011-02-01 09:59:15 |  47.9425538604203 |
|   6 | 2011-02-01 09:59:15 |  67.7570968074042 |
|   7 | 2011-02-01 09:59:15 |  90.1904762132767 |
|   8 | 2011-02-01 09:59:15 |  114.776974543924 |
|   9 | 2011-02-01 09:59:15 |  140.998843732947 |
|  10 | 2011-02-01 09:59:15 |  168.294196961579 |
|  11 | 2011-02-01 09:59:15 |  196.065619213516 |
|  12 | 2011-02-01 09:59:15 |  223.689380632134 |
|  13 | 2011-02-01 09:59:15 |   250.52512820847 |
...

Find complete SQL script in google_charts.sql

In the above table we assume the timestamp values are evenly distributed. This is because of the aforementioned fact that Google Charts do not support time-series. So, if the given timnestamps are not evenly distributed, or maybe some values are skipped, we must manually compensate for that. In mycheckpoint I do that. In this blog post, we skip it.

Axis boundaries

By default, Google Charts expect data to lie within the range 0..100. We need to reset the scale to match our minimum/maximum values, and lay out our values within that range.

We will be using views in this solution, and here is the first view:

CREATE OR REPLACE VIEW
 chart_data_minmax
AS
 SELECT MIN(data) AS min_value, MAX(data) AS max_value
FROM
 chart_data
;

In our example we will just settle with the min/max values. In real worlds we may:

  • Choose 0 to be the lower value, even if the minimal value is positive
  • Choose to round min-max values to a nearest power of 10.

The axis boundaries go under the following parameters:

  • chxt=x,y (we hard code bottom axis, left axis)
  • chxr=1,min,max (with min, max as computed by the above)

The following query produces these params:

SELECT CONCAT('&chxt=x,y&chxr=1,', ROUND(min_value, 1), ',', ROUND(max_value, 1)) AS minmax FROM chart_data_minmax;
+---------------------------------+
| minmax                          |
+---------------------------------+
| &chxt=x,y&chxr=1,-4716.6,5340.0 |
+---------------------------------+

Data values

Next, we have some 244 rows of data. Data is double. Can we be certain this all fits within 2048 characters? What if we had more rows?

Alas, we must make a trade-off. We will sacrifice accuracy in favor of a short URL. We will use simple encoding for our data. This means just one character per value, encoded A-Za-z0-9. This means just 62 distinct values.

So we must translate each data value into the range 0..61, and then into a character. First step is:

SELECT (data - min_value)/(max_value - min_value) AS range_01 FROM chart_data, chart_data_minmax;
+---------------------+
| range_01            |
+---------------------+
|   0.469204617275163 |
|   0.469796281944551 |
|   0.470769223945374 |
|   0.472103901937297 |
|   0.473773369865649 |
|   0.475743682188048 |
|   0.477974405550919 |
...

The above translates the data into the 0..1 range. Next is to simply multiply by 61:

SELECT 61*(data - min_value)/(max_value - min_value) AS range_0_61 FROM chart_data, chart_data_minmax;
+-------------------+
| range_0_61        |
+-------------------+
|  28.6214816537849 |
|  28.6575731986176 |
|  28.7169226606678 |
|  28.7983380181751 |
|  28.9001755618046 |
|  29.0203646134709 |
|  29.1564387386061 |
|  29.3055730348683 |
|  29.4646269983728 |
|  29.6301923928522 |
...

To translate into a simple format character we parse a constant string:

SELECT
  SUBSTRING(
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',
    1+61*(data - min_value)/(max_value - min_value),
    1
  ) AS c FROM chart_data, chart_data_minmax;
+------+
| c    |
+------+
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| e    |
| e    |
...

Oh, and we should handle NULLs, as well, so:

SELECT
  IF(
    data IS NULL,
    '_',
    SUBSTRING(
      'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',
      1+61*(data - min_value)/(max_value - min_value),
      1
    )
  ) AS c FROM chart_data, chart_data_minmax;
+------+
| c    |
+------+
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| d    |
| e    |
| e    |
...

And, to present it as a char sequence, we use the chd parameter and call GROUP_CONCAT to the rescue:

SELECT
  CONCAT('&chd=s:',
    GROUP_CONCAT(
      IF(
        data IS NULL,
        '_',
        SUBSTRING(
          'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',
          1+61*(data - min_value)/(max_value - min_value),
          1
        )
      )
      SEPARATOR ''
    )
  ) AS c FROM chart_data, chart_data_minmax
\G
*************************** 1. row ***************************
c: &chd=s:dddddddddeeeeeefffffffffeeeedddcccbbaaZZZYYYXXXXXXXXXYYYZZabbcdeefghhijkkllmmmmmmmmllkkjihgfedcbZYXWVUTSRRQQPPPPQQQRSTUVWXZacdfgijlmnpqrssttuuuttssrqonmkigfdbZXVTSQONMLKJIIIIIIJKLMOPRTVXZbegilnprtvwyz01111110zyxvtrpnkifcaXUSPNLJHFECBBAAABBCEFHJLNQTWZcfilortwy1346789999876420yvspmjfcYVSOL

Wrap up

Combining the above two queries, with some URL necessities, we get:

SELECT
  CONCAT(
    'http://chart.apis.google.com/chart?cht=lc&chs=400x200&chtt=SQL%20chart&chxt=x,y&chxr=1,',
    ROUND(min_value, 1), ',',
    ROUND(max_value, 1),
    '&chd=s:',
    GROUP_CONCAT(
      IF(
        data IS NULL,
        '_',
        SUBSTRING(
          'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',
          1+61*(data - min_value)/(max_value - min_value),
          1
        )
      )
      SEPARATOR ''
    )
  ) FROM chart_data, chart_data_minmax
;

And get:



http://chart.apis.google.com/chart?cht=lc&chs=400x200&chtt=SQL%20chart&chxt=x,y&chxr=1,-4716.6,5340.0&chd=s:dddddddddeeeeeefffffffffeeeedddcccbbaaZZZYYYXXXXXXXXXYYYZZabbcdeefghhijkkllmmmmmmmmllkkjihgfedcbZYXWVUTSRRQQPPPPQQQRSTUVWXZacdfgijlmnpqrssttuuuttssrqonmkigfdbZXVTSQONMLKJIIIIIIJKLMOPRTVXZbegilnprtvwyz01111110zyxvtrpnkifcaXUSPNLJHFECBBAAABBCEFHJLNQTWZcfilortwy1346789999876420yvspmjfcYVSOL

The above chart depicts our visualized data. Alas, x-axis ranges are wrong; no ticks nor grids are yet presented.

Granularity is not the best, either: the different 62 values are spread across a 200 pixel high chart, making for a 3 pixel distinction between two values, evident on the left area of the chart.

Continues on Generating Google line charts with SQL, part II.

tags: , ,
posted in MySQL by shlomi

« | »

Follow comments via the RSS Feed | Leave a comment | Trackback URL

3 Comments to "Generating Google line charts with SQL, part I"

  1. nadavkav wrote:

    Thank you :-)
    Very useful ! Exactly what i need. :-)

  2. Research by 1cainr - Pearltrees wrote:

    [...] Generating Google line charts with SQL, part I | code.openark.org In the above table we assume the timestamp values are evenly distributed. This is because of the aforementioned fact that Google Charts do not support time-series. So, if the given timnestamps are not evenly distributed, or maybe some values are skipped, we must manually compensate for that. In mycheckpoint I do that. [...]

  3. John Kaewsuksai wrote:

    Thanks Shlomi - so much useful, concise, easy to follow info.

Leave Your Comment

 
Powered by Wordpress and MySQL. Theme by openark.org