In this series of posts I wish to show how Google Charts can be generated via SQL. We discuss the Google Charts limitations which must be challenged, and work towards a simple chart.
I’m going to present the algorithm I use in mycheckpoint, a MySQL monitoring utility, which generates Google charts by raw data using views. An example of such chart follows:
http://chart.apis.google.com/chart?cht=lc&chs=370x180&chts=303030,12&chtt=Latest+24+hours:+Nov+9,+05:50++-++Nov+10,+05:50&chf=c,s,ffffff&chdl=Rentals+rate:+custom_1_psec&chdlp=b&chco=ff8c00&chd=s:GDGKGFLFGMJHRLMPPNULJRPLTOPRUMYPPVRNbQUSUSbSNWUOfSWTObVSUVWSVYVPbTPjfTbRTdXReUWhcTQRQZbTWYVYPaVZXdYYWPTabYUTbW99QLgLNIOIRNNMIKRJEHGFHGJGGFIFDFGDK&chxt=x,y&chxr=1,0,8.720000&chxl=0:|+||08:00||+||12:00||+||16:00||+||20:00||+||00:00||+||04:00||&chxs=0,505050,10,0,lt&chg=4.17,25,1,2,0.69,0&chxp=0,0.69,4.86,9.03,13.20,17.37,21.54,25.71,29.88,34.05,38.22,42.39,46.56,50.73,54.90,59.07,63.24,67.41,71.58,75.75,79.92,84.09,88.26,92.43,96.60&tsstart=2010-11-09+05:50:00&tsstep=600
mycheckpoint does not actually call on Google to do the chart rendering, but invokes its own JavaScript code to visualize the URL locally.
Here are some downsides for using Google charts:
- The URL cannot be as long as you like. 2048 characters is an upper bound you’ll want to keep behind. [Google charts POST method calls are available, which leads to 16K equivalent of URL length — this is still not too helpful due to the nature of POST calls]
- Features are inconsistent. To specify label or tick positions, one must specify exact positions. To specify grid positions, one must supply with step, offset, etc. There are more such inconsistencies.
- Google charts are not too friendly. Taking the ticks and grids example from above, there really shouldn’t be a reason why grids would not be automatically generated according to ticks definitions. But we are required to specify positions for the ticks as well as for the grids.
- There is no support for time-series. One must translate time as x-axis values.
- Perhaps most intimidating to many people: to generate a Google chart, once must send data to Google. Which is the main reason I used local JavaScript rendering.
Anyway, let’s build a very simple chart. Since I will not cover everything in this post, we make for some relaxed conditions.
Sample data
Consider the chart_data table, below:
SELECT * FROM chart_data; +-----+---------------------+-------------------+ | id | ts | data | +-----+---------------------+-------------------+ | 1 | 2011-02-01 09:59:15 | 1.99666833293656 | | 2 | 2011-02-01 09:59:15 | 7.94677323180245 | | 3 | 2011-02-01 09:59:15 | 17.7312123996804 | | 4 | 2011-02-01 09:59:15 | 31.153467384692 | | 5 | 2011-02-01 09:59:15 | 47.9425538604203 | | 6 | 2011-02-01 09:59:15 | 67.7570968074042 | | 7 | 2011-02-01 09:59:15 | 90.1904762132767 | | 8 | 2011-02-01 09:59:15 | 114.776974543924 | | 9 | 2011-02-01 09:59:15 | 140.998843732947 | | 10 | 2011-02-01 09:59:15 | 168.294196961579 | | 11 | 2011-02-01 09:59:15 | 196.065619213516 | | 12 | 2011-02-01 09:59:15 | 223.689380632134 | | 13 | 2011-02-01 09:59:15 | 250.52512820847 | ...
Find complete SQL script in google_charts.sql
In the above table we assume the timestamp values are evenly distributed. This is because of the aforementioned fact that Google Charts do not support time-series. So, if the given timnestamps are not evenly distributed, or maybe some values are skipped, we must manually compensate for that. In mycheckpoint I do that. In this blog post, we skip it.
Axis boundaries
By default, Google Charts expect data to lie within the range 0..100. We need to reset the scale to match our minimum/maximum values, and lay out our values within that range.
We will be using views in this solution, and here is the first view:
CREATE OR REPLACE VIEW chart_data_minmax AS SELECT MIN(data) AS min_value, MAX(data) AS max_value FROM chart_data ;
In our example we will just settle with the min/max values. In real worlds we may:
- Choose 0 to be the lower value, even if the minimal value is positive
- Choose to round min-max values to a nearest power of 10.
The axis boundaries go under the following parameters:
- chxt=x,y (we hard code bottom axis, left axis)
- chxr=1,min,max (with min, max as computed by the above)
The following query produces these params:
SELECT CONCAT('&chxt=x,y&chxr=1,', ROUND(min_value, 1), ',', ROUND(max_value, 1)) AS minmax FROM chart_data_minmax; +---------------------------------+ | minmax | +---------------------------------+ | &chxt=x,y&chxr=1,-4716.6,5340.0 | +---------------------------------+
Data values
Next, we have some 244 rows of data. Data is double. Can we be certain this all fits within 2048 characters? What if we had more rows?
Alas, we must make a trade-off. We will sacrifice accuracy in favor of a short URL. We will use simple encoding for our data. This means just one character per value, encoded A-Za-z0-9. This means just 62 distinct values.
So we must translate each data value into the range 0..61, and then into a character. First step is:
SELECT (data - min_value)/(max_value - min_value) AS range_01 FROM chart_data, chart_data_minmax; +---------------------+ | range_01 | +---------------------+ | 0.469204617275163 | | 0.469796281944551 | | 0.470769223945374 | | 0.472103901937297 | | 0.473773369865649 | | 0.475743682188048 | | 0.477974405550919 | ...
The above translates the data into the 0..1 range. Next is to simply multiply by 61:
SELECT 61*(data - min_value)/(max_value - min_value) AS range_0_61 FROM chart_data, chart_data_minmax; +-------------------+ | range_0_61 | +-------------------+ | 28.6214816537849 | | 28.6575731986176 | | 28.7169226606678 | | 28.7983380181751 | | 28.9001755618046 | | 29.0203646134709 | | 29.1564387386061 | | 29.3055730348683 | | 29.4646269983728 | | 29.6301923928522 | ...
To translate into a simple format character we parse a constant string:
SELECT SUBSTRING( 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789', 1+61*(data - min_value)/(max_value - min_value), 1 ) AS c FROM chart_data, chart_data_minmax; +------+ | c | +------+ | d | | d | | d | | d | | d | | d | | d | | d | | d | | e | | e | ...
Oh, and we should handle NULLs, as well, so:
SELECT IF( data IS NULL, '_', SUBSTRING( 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789', 1+61*(data - min_value)/(max_value - min_value), 1 ) ) AS c FROM chart_data, chart_data_minmax;+------+ | c | +------+ | d | | d | | d | | d | | d | | d | | d | | d | | d | | e | | e | ...
And, to present it as a char sequence, we use the chd parameter and call GROUP_CONCAT to the rescue:
SELECT CONCAT('&chd=s:', GROUP_CONCAT( IF( data IS NULL, '_', SUBSTRING( 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789', 1+61*(data - min_value)/(max_value - min_value), 1 ) ) SEPARATOR '' ) ) AS c FROM chart_data, chart_data_minmax \G *************************** 1. row *************************** c: &chd=s:dddddddddeeeeeefffffffffeeeedddcccbbaaZZZYYYXXXXXXXXXYYYZZabbcdeefghhijkkllmmmmmmmmllkkjihgfedcbZYXWVUTSRRQQPPPPQQQRSTUVWXZacdfgijlmnpqrssttuuuttssrqonmkigfdbZXVTSQONMLKJIIIIIIJKLMOPRTVXZbegilnprtvwyz01111110zyxvtrpnkifcaXUSPNLJHFECBBAAABBCEFHJLNQTWZcfilortwy1346789999876420yvspmjfcYVSOL
Wrap up
Combining the above two queries, with some URL necessities, we get:
SELECT CONCAT( 'http://chart.apis.google.com/chart?cht=lc&chs=400x200&chtt=SQL%20chart&chxt=x,y&chxr=1,', ROUND(min_value, 1), ',', ROUND(max_value, 1), '&chd=s:', GROUP_CONCAT( IF( data IS NULL, '_', SUBSTRING( 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789', 1+61*(data - min_value)/(max_value - min_value), 1 ) ) SEPARATOR '' ) ) FROM chart_data, chart_data_minmax ;
And get:
http://chart.apis.google.com/chart?cht=lc&chs=400x200&chtt=SQL%20chart&chxt=x,y&chxr=1,-4716.6,5340.0&chd=s:dddddddddeeeeeefffffffffeeeedddcccbbaaZZZYYYXXXXXXXXXYYYZZabbcdeefghhijkkllmmmmmmmmllkkjihgfedcbZYXWVUTSRRQQPPPPQQQRSTUVWXZacdfgijlmnpqrssttuuuttssrqonmkigfdbZXVTSQONMLKJIIIIIIJKLMOPRTVXZbegilnprtvwyz01111110zyxvtrpnkifcaXUSPNLJHFECBBAAABBCEFHJLNQTWZcfilortwy1346789999876420yvspmjfcYVSOL
The above chart depicts our visualized data. Alas, x-axis ranges are wrong; no ticks nor grids are yet presented.
Granularity is not the best, either: the different 62 values are spread across a 200 pixel high chart, making for a 3 pixel distinction between two values, evident on the left area of the chart.
Continues on Generating Google line charts with SQL, part II.
Thank you 🙂
Very useful ! Exactly what i need. 🙂
Thanks Shlomi – so much useful, concise, easy to follow info.