Useful temporal functions & queries

Here’s a complication of some common and useful time & date calculations and equations. Some, though very simple, are often misunderstood, leading to inefficient or incorrect implementations.

There are many ways to solve such problems. I’ll present my favorites.

Querying for time difference

Given two timestamps: ts1 (older) and ts2 (newer), how much time has passed between them?

One can use TIMEDIFF() & DATEDIFF(), or compare two UNIX_TIMESTAMP() values. My personal favorite is to use TIMESTAMPDIFF(). Reason being that I’m usually interested in a specific metric, like the number of hours which have passed, or the number of days, disregarding the smaller minute/second resolution. Which allows one to:

SELECT TIMESTAMPDIFF(HOUR, ts1, ts2)

Take, for example:

SELECT TIMESTAMPDIFF(MONTH, '2008-10-07 00:00:00', '2009-12-06 00:00:00')

The function correctly identifies the number of days per month, and provides with 13, being the truncated number of full months.

Doing arithmetics

One can use TIMESTAMPADD(), or DATE_SUB(), but, again, when dealing with specific resolutions, I find “+ INTERVAL” to be the most convenient:

SELECT ts1 + INTERVAL 10 HOUR

Continue reading » “Useful temporal functions & queries”

SQL multi line chart

Time for another charting SQL query. I wish to present a single-query generated multi-line/area chart. I’ll walk through some of the steps towards making this happen. By the end of this post I’ll present some real-data charts, area charts and colored charts.

+---------+-----------------------------------------------------------------------------------+
| y_scale | chart                                                                             |
+---------+-----------------------------------------------------------------------------------+
| 1       | ****---------#######----------------------------------------*******--------###### |
| 0.88    | ----***---###-------##-----------------------------------***-------***---##------ |
| 0.75    | -------**#------------##-------------------------------**-------------*##-------- |
| 0.63    | ------##-*--------------##----------------------------*--------------##**-------- |
| 0.5     | -@@@@@@@@@@@@@@@----------#-----@@@@@@@@@@@@@@@@----**---------@@@@@@@@@@@@@@@@-- |
| 0.38    | ----#-------*--------------##----------------------*---------------#-------*----- |
| 0.25    | --##---------*---------------#--------------------*--------------##---------*---- |
| 0.13    | -#------------**--------------#-----------------**--------------#------------*--- |
| 0       | @---------------*--------------##--------------*---------------#--------------**- |
| -0.12   | -----------------*---------------#------------*--------------##-----------------* |
| -0.25   | ------------------*---------------#---------**--------------#-------------------- |
| -0.37   | -------------------**--------------#-------*---------------#--------------------- |
| -0.5    | ----------------@@@@@@@@@@@@@@@@----##----*-----@@@@@@@@@@@@@@@----------------@@ |
| -0.62   | ----------------------**--------------#-**--------------#------------------------ |
| -0.75   | ------------------------**------------**#-------------##------------------------- |
| -0.87   | --------------------------**-------***---###-------###--------------------------- |
| -1      | ----------------------------*******---------#######------------------------------ |
|         | v:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::v |
|         | 0                                                                               8 |
|         |     # sin(x)                                                                      |
|         |     * cos(x)                                                                      |
|         |     @ sign(tan(x))/2                                                              |
+---------+-----------------------------------------------------------------------------------+

Continue reading » “SQL multi line chart”

How to calculate a good InnoDB log file size – recap

Following Baron Schwartz’ post: How to calculate a good InnoDB log file size, which shows how to make an estimate for the InnoDB log file size, and based on SQL: querying for status difference over time, I’ve written a query to run on MySQL 5.1, which, upon sampling 60 seconds of status, estimates the InnoDB transaction log bytes that are expected to be written in the period of 1 hour.

Recap: this information can be useful if you’re looking for a good innodb_log_file_size value, such that will not pose too much I/O (smaller values will make for more frequent flushes), not will make for a too long recovery time (larger values mean more transactions to recover upon crash).

It is assumed that the 60 seconds period represents an average system load, not some activity spike period. Edit the sleep time and factors as you will to sample longer or shorter periods. Continue reading » “How to calculate a good InnoDB log file size – recap”

SQL: querying for status difference over time

The InnoDB plugin has a nice INFORMATION_SCHEMA concept: resetting tables. For example, the INNODB_CMP table lists information about compression operation. A similar table, INNODB_CMP_RESET, provides the same information, but resets the values. The latter can be used to measure, for example, number of compression operations over time.

I wish to present a SQL trick which does the same, without need for resetting tables. Suppose you have some status table, and you wish to measure the change in status per second, per minute etc. The trick is to query for the value twice in the same query, with some pause in between, and make the difference calculation.

For sake of simplicity, I’ll demonstrate using 5.1’s INFORMATION_SCHEMA.GLOBAL_STATUS. Please refer to INFORMATION_SCHEMA.GLOBAL_STATUS: watch out for some discussion on this.

Continue reading » “SQL: querying for status difference over time”

SQL: ranking without self join, revisited

This post follows SQL: Ranking without self join and On user variables evaluation order. I wish to share some insights with regard to user variables evaluation, as well as provide yet another ranking solution, which attempts to overcome the uncertainty factor with user variables.

There will be hand waving in this post (albeit empirical hand waving). Stop here if you don’t like hand waving. Continue if you feel curious or wish to contradict my assumptions.

Recap

The order of evaluation of user variables is undefined. The documentation has some contradicting example (bug 47514), but states that variables should not be read and assigned in different parts of the same statement (just what is a different part? Bug 47516).

Looking for a solution

There doesn’t seem to be a problem with reading and assigning variables in the very same part of the statement. For example, SELECT @a := @a+1… is exactly such a case.

Continue reading » “SQL: ranking without self join, revisited”

SQL: Ranking without self join

The common way of solving the classic SQL problem of ranking, involves a  self join. I wish to present a different solution, which only iterates the table once, and provides the same output.

The ranking problem

Given a table with names and scores (e.g. students exams scores), add rank for each row, such that the rank identifies her position among other rows. Rows with identical scores should receive the same rank (e.g. both contenders got the silver medal).

Consider the following table (download score.sql):

mysql> select * from score;
+----------+--------------+-------+
| score_id | student_name | score |
+----------+--------------+-------+
|        1 | Wallace      |    95 |
|        2 | Gromit       |    97 |
|        3 | Shaun        |    85 |
|        4 | McGraw       |    92 |
|        5 | Preston      |    92 |
+----------+--------------+-------+
5 rows in set (0.00 sec)

We wish to present ranks in some way similar to:

+----------+--------------+-------+------+
| score_id | student_name | score | rank |
+----------+--------------+-------+------+
|        2 | Gromit       |    97 |    1 |
|        1 | Wallace      |    95 |    2 |
|        4 | McGraw       |    92 |    3 |
|        5 | Preston      |    92 |    3 |
|        3 | Shaun        |    85 |    4 |
+----------+--------------+-------+------+

Continue reading » “SQL: Ranking without self join”

Generating numbers out of seemingly thin air

In some of my previous posts I’ve used a numbers table, like one holding values 1, 2, 3, …, 255. Such table can be used for string walking, joining with other tables, performing iterations.

The existence of number tables has always been a little pain. Yes, they’re very, very simple, but they need to be there. So if you just need to script some SQL query, you may find that you need to create such tables. Ummm… this means you need to have privileges (at least CREATE TEMPORARY and INSERT, if not CREATE).

The other day, Baron Schwartz posted How to round to the nearest whole multiple or fraction in SQL. In an offhand way, he generated some random numbers using the mysql.help_topic table. I then realized that post solved something I’ve been looking for: using a sure-to-exist table on any MySQL installation.

Continue reading » “Generating numbers out of seemingly thin air”

SQL pie chart

My other half says I’m losing it. But I think that as an enthusiast kernel developer she doesn’t have the right to criticize people. (“I like user space better!” – she exclaims upon reading this).

Shown below is a (single query) SQL-generated pie chart. I will walk through the steps towards making this happen, and conclude with what, I hope you’ll agree, are real-world, useful usage samples.

+----------------------------------------------------------------------+
| pie_chart                                                            |
+----------------------------------------------------------------------+
|                                                                      |
|                         ;;;;;;;;;;;;;;;;;;;;;                        |
|                  oooooooo;;;;;;;;;;;;;;;;;;;;;;;;;;;                 |
|             oooooooooooooo;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;            |
|          ooooooooooooooooo                 ;;;;;;;;;;;;#####         |
|        oooooooooooooo                           ;#############       |
|       oooooooooooo                                 ############      |
|      oooooooooooo                                   ############     |
|      ooooooooooo                                     ###########     |
|      oooooooooooo                                   ::::::::::::     |
|       oooooooooooo                                 ::::::::::::      |
|        ooooooooo:::::                           ::::::::::::::       |
|          o::::::::::::::::                 :::::::::::::::::         |
|             :::::::::::::::::::::::::::::::::::::::::::::            |
|                  :::::::::::::::::::::::::::::::::::                 |
|                         :::::::::::::::::::::                        |
|                                                                      |
| ##  red: 1 (10%)                                                     |
| ;;  blue: 2 (20%)                                                    |
| oo  orange: 3 (30%)                                                  |
| ::  white: 4 (40%)                                                   |
+----------------------------------------------------------------------+

Requirements

We need a generic query, which returns at least these two columns: name_column and value_column. For example, the following query will do: Continue reading » “SQL pie chart”

SQL graphics

SQL is not meant to generate graphics, for sure; but I see some cases where generating non-tabular output can be desirable, as I will show in future posts.

I’d like to explain the basics of working SQL graphics: it is actually possible to do whatever you like. How?

Coordinates system

We’ll now develop a coordinates system using SQL. By producing this, I will have proven my point that anything is possible, and will provide an additional proof of concept.

To start with generating coordinates, I’ll need a helper table: a numbers table (tinyint_asc, example, with numbers ranging 0..255).

We’ll strive to produce a 10×10 coordinate matrix. To do this, we’ll self-join the numbers table against itself, and use a helper variable to set the size of the matrix. Continue reading » “SQL graphics”

Auto scaling, scaled SQL graphs concluded

I wasn’t sure I was to go this far. After catching breath the following have been added to Generic, auto scaling, scaled SQL graphs, and these will conclude my current hacks:

  • Displaying X-axis min/max values.
  • Support for Y-axis values precision.
  • Support for pre-defined scale range.

The addition of the above makes for presentable, usable graphs. See also sample graphs at the end of this post.

Step 8: adding X-axis values

I add minimum/maximum X-scale values to the graph. What was just ordering_column before, now turns to be the x in the y = f(x) function. Continue reading » “Auto scaling, scaled SQL graphs concluded”