Useful temporal functions & queries

Here’s a compilation of some common and useful time & date calculations and equations. Some, though very simple, are often misunderstood, leading to inefficient or incorrect implementations.

There are many ways to solve such problems. I’ll present my favorites.

Querying for time difference

Given two timestamps: ts1 (older) and ts2 (newer), how much time has passed between them?

One can use TIMEDIFF() & DATEDIFF(), or compare two UNIX_TIMESTAMP() values. My personal favorite is TIMESTAMPDIFF(), the reason being that I’m usually interested in a specific metric, like the number of hours or days which have passed, disregarding the finer minute/second resolution. This allows one to write:

SELECT TIMESTAMPDIFF(HOUR, ts1, ts2)

Take, for example:

SELECT TIMESTAMPDIFF(MONTH, '2008-10-07 00:00:00', '2009-12-06 00:00:00')

The function correctly accounts for the varying number of days per month, and returns 13: the truncated number of full months.
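To make the truncation explicit, consider this small comparison (a sketch; dates chosen by hand, expected results noted in comments):

SELECT
  TIMESTAMPDIFF(MONTH, '2008-10-07', '2009-11-07') AS exact_months,     -- 13
  TIMESTAMPDIFF(MONTH, '2008-10-07', '2009-12-06') AS truncated_months, -- 13: just short of 14
  TIMESTAMPDIFF(DAY,   '2008-10-07', '2009-12-06') AS days_elapsed      -- 425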

Doing arithmetic

One can use TIMESTAMPADD() or DATE_SUB(), but, again, when dealing with specific resolutions, I find “+ INTERVAL” to be the most convenient:

SELECT ts1 + INTERVAL 10 HOUR
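For instance, a quick sketch (any temporal unit works the same way, in both directions):

SELECT '2009-12-06 00:00:00' + INTERVAL 10 HOUR  AS ten_hours_later,
       '2009-12-06 00:00:00' - INTERVAL 3 MONTH  AS three_months_back,
       NOW() + INTERVAL 1 WEEK                   AS one_week_from_now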

Continue reading » “Useful temporal functions & queries”

On restoring a single table from mysqldump

Following Restore one table from an ALL database dump and Restore a Single Table From mysqldump, I would like to add my own thoughts and comments on the subject.

I also wish to note performance issues with the two suggested solutions, and offer improvements.

Problem relevance

While the problem is interesting, I just want to note that it is only relevant within a specific range of database sizes. Too small – and it doesn’t matter how you solve it (e.g. just open vi/emacs and copy+paste). Too big – and it would not be worthwhile to restore from mysqldump anyway. I would suggest that the problem is interesting in the neighborhood of a few dozen GB worth of data.

Problem recap

Given a dump file (generated by mysqldump), how do you restore a single table, without making any changes to other tables?

Let’s review the two referenced solutions. I’ll be using the employees db on mysql-sandbox for testing. I’ll choose a very small table to restore: departments (only a few rows in this table).

Security-based solution

Chris suggests creating a special-purpose account, which only has write privileges (CREATE, INSERT, etc.) on the particular table to restore. Cool hack! But, I’m afraid, not too efficient, for two reasons:

Continue reading » “On restoring a single table from mysqldump”
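For context, here is a hedged sketch of what such a restricted account might look like (the account name and password are made up, and the exact privilege list depends on the dump’s contents); the full dump would then presumably be replayed through this account with mysql --force, so that privilege errors on all other tables are skipped:

-- Hypothetical restricted account: may only (re)create and populate departments
CREATE USER 'restorer'@'localhost' IDENTIFIED BY 'restore_password';
GRANT CREATE, DROP, INSERT, ALTER
  ON employees.departments
  TO 'restorer'@'localhost';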

questions or queries?

I’ve hit a recent change which took me by surprise.

I was used to checking the ‘questions’ global status variable to see the total number of queries the server performs. So, for example, I could compute com_select/questions to learn the SELECT ratio out of all queries.

Apparently, as of 5.0.72/5.0.76 & 5.1.31 this has changed. A new status variable was introduced, called ‘queries’.

The change being? questions no longer indicates the number of queries the server has executed: only the number of queries requested by the client (so a call to a stored routine counts as 1, regardless of how many queries the routine executes). The new queries variable indicates the number of queries the server has issued (see the 5.0 and 5.1 docs for details).

So, as of 5.0.72 or 5.1.31, the calculation should be com_select/queries (or com_select_diff/queries_diff) to learn the SELECT ratio of all queries. I learned this due to a bug report on mycheckpoint, which presented a SELECT ratio of some 10265%…
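On a 5.1 server the ratio can be sampled directly; a sketch follows (note it reads the cumulative counters since server startup; for a rate over a recent period, difference-based values such as mycheckpoint’s _diff columns are more appropriate):

SELECT 100 * sel.VARIABLE_VALUE / q.VARIABLE_VALUE AS select_percent
FROM INFORMATION_SCHEMA.GLOBAL_STATUS AS sel,
     INFORMATION_SCHEMA.GLOBAL_STATUS AS q
WHERE sel.VARIABLE_NAME = 'COM_SELECT'
  AND q.VARIABLE_NAME = 'QUERIES'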

My take on this is that it could have been worked out differently: instead of changing the meaning of an existing variable, questions could have remained as it was, with the introduction of, say, client_questions, which would count only client-issued queries.

I believe changing the meaning of status variables at such late versions (5.0.76 is quite late!) invites trouble: code that used to work on already-stable versions (e.g. 5.0.51) behaves differently after an upgrade. Such changes are best made while still in the BETA phase.

Performance analysis with mycheckpoint

mycheckpoint (see announcement) allows for both graph presentation and quick SQL access to monitored & analyzed data. I’d like to show the power of combining the two.

InnoDB performance

Taking a look at one of the most important InnoDB metrics: the read hit ratio (we could get the same graph by looking at the HTML report):

SELECT innodb_read_hit_percent FROM sv_report_chart_sample \G
*************************** 1. row ***************************
innodb_read_hit_percent: http://chart.apis.google.com/chart?cht=lc&chs=400x200&chts=303030,12&chtt=Nov+10,+11:40++-++Nov+11,+08:55+(0+days,+21+hours)&chdl=innodb_read_hit_percent&chdlp=b&chco=ff8c00&chd=s:400664366P6674y7176677677u467773y64ux166666764366646y616666666666644444434444s6u4S331444404433341334433646777666666074736777r1777767764776666F667777617777777777777777yaRi776776mlf667676xgx776766rou67767777u37797777x76676776u6A737464y67467761777666643u66446&chxt=x,y&chxr=1,99.60,100.00&chxl=0:||Nov+10,+15:55|Nov+10,+20:10|Nov+11,+00:25|Nov+11,+04:40|&chxs=0,505050,10

We see that the read hit ratio is usually high, but occasionally drops low, down to 99.7 or even 99.6. Still, it seems like most of the time we are above a 99.95% read hit ratio. It’s hard to tell whether we stay above 99.98%.

Can we know for sure?

We can strain our eyes, yet be certain of little. It’s best if we just query for the metrics! mycheckpoint provides all the data, accessible by simple SQL queries:

Continue reading » “Performance analysis with mycheckpoint”
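As an illustration of the kind of query one might issue, here’s a sketch; the sv_report_sample view and its columns are assumed names for the example, not necessarily mycheckpoint’s actual ones:

-- How often did the read hit ratio actually dip below 99.95%?
SELECT COUNT(*) AS total_samples,
       SUM(innodb_read_hit_percent < 99.95) AS samples_below_99_95,
       MIN(innodb_read_hit_percent) AS worst_sample
FROM sv_report_sample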

Replication analysis with mycheckpoint

I would like to show how mycheckpoint (see announcement) can be put to use for analyzing various replication metrics.

Lagging slaves

A slave has been monitored. Monitoring started at a time when it was way behind its master (about two days of lag), but it has since caught up. This can be easily verified by the following chart:

The above chart can be obtained by viewing the HTML report:

SELECT html FROM sv_report_html

Or by directly issuing the query:

mysql> SELECT seconds_behind_master FROM sv_report_chart_hour\G
*************************** 1. row ***************************
seconds_behind_master: http://chart.apis.google.com/chart?cht=lc&chs=400x200&chts=303030,12&chtt=Nov+5,+10:00++-++Nov+10,+08:00+(4+days,+22+hours)&chdl=seconds_behind_master&chdlp=b&chco=ff8c00&chd=s:976431zzzywutrpnliiifdbZYXVTRRRPNLJHEBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA&chxt=x,y&chxr=1,0,169811&chxl=0:||Nov+6,+09:00|Nov+7,+09:00|Nov+8,+08:00|Nov+9,+08:00|&chxs=0,505050,10

This is all nice. But I’m also interested in the rate at which slave lag decreased. Many ignore this important metric: just how fast does your slave replicate?
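One way to get at that rate, sketched here with assumed view and column names (id, ts and seconds_behind_master per sample, with consecutive ids), is to compare each sample with its successor:

-- Lag-seconds recovered per second of wall-clock time.
SELECT s1.ts,
       (s0.seconds_behind_master - s1.seconds_behind_master)
         / TIMESTAMPDIFF(SECOND, s0.ts, s1.ts) AS catchup_rate
FROM sv_report_sample AS s0
JOIN sv_report_sample AS s1 ON (s1.id = s0.id + 1)

A catch-up rate of 0 means the slave merely keeps pace; a rate of 1 means it replays events twice as fast as the master produces them.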

Continue reading » “Replication analysis with mycheckpoint”

Announcing mycheckpoint: lightweight, SQL oriented monitoring for MySQL

I’m proud to announce mycheckpoint, a monitoring utility for MySQL, with strong emphasis on user accessibility to monitored data.

mycheckpoint is a different kind of monitoring tool. It leaves the power in the user’s hands. Its power is not in script-based calculations over recorded data, but in the creation of a view hierarchy, which allows the user to access computed metrics directly.

mycheckpoint is needed first, to deploy a monitoring schema. It may be needed next, to INSERT recorded data (GLOBAL STATUS, GLOBAL VARIABLES, MASTER STATUS, SLAVE STATUS). But this is just a simple INSERT; anyone can do that, even another monitoring tool.

After that, you do not need it anymore: everything is laid at your fingertips. Consider:

SELECT innodb_read_hit_percent, DML FROM sv_report_chart_hour;

mycheckpoint provides the views which take raw data (plain innodb_buffer_pool_read_requests, com_select, innodb_buffer_pool_size, table_open_cache, seconds_behind_master values, etc.) and generate Google Charts URLs, HTML reports, human-readable reports, or otherwise easily accessible data.
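To illustrate the idea, here is just the flavor of it, not mycheckpoint’s actual view definition (sv_diff and the _diff columns are assumed names):

-- A computed metric, layered as a view over raw diff'ed status samples.
CREATE VIEW sv_read_hit_sketch AS
  SELECT id, ts,
         100 * (1 - innodb_buffer_pool_reads_diff
                    / innodb_buffer_pool_read_requests_diff)
           AS innodb_read_hit_percent
  FROM sv_diff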

Continue reading » “Announcing mycheckpoint: lightweight, SQL oriented monitoring for MySQL”

SQL multi line chart

Time for another charting SQL query. I wish to present a single-query generated multi-line/area chart. I’ll walk through some of the steps towards making this happen. By the end of this post I’ll present some real-data charts, area charts and colored charts.

+---------+-----------------------------------------------------------------------------------+
| y_scale | chart                                                                             |
+---------+-----------------------------------------------------------------------------------+
| 1       | ****---------#######----------------------------------------*******--------###### |
| 0.88    | ----***---###-------##-----------------------------------***-------***---##------ |
| 0.75    | -------**#------------##-------------------------------**-------------*##-------- |
| 0.63    | ------##-*--------------##----------------------------*--------------##**-------- |
| 0.5     | -@@@@@@@@@@@@@@@----------#-----@@@@@@@@@@@@@@@@----**---------@@@@@@@@@@@@@@@@-- |
| 0.38    | ----#-------*--------------##----------------------*---------------#-------*----- |
| 0.25    | --##---------*---------------#--------------------*--------------##---------*---- |
| 0.13    | -#------------**--------------#-----------------**--------------#------------*--- |
| 0       | @---------------*--------------##--------------*---------------#--------------**- |
| -0.12   | -----------------*---------------#------------*--------------##-----------------* |
| -0.25   | ------------------*---------------#---------**--------------#-------------------- |
| -0.37   | -------------------**--------------#-------*---------------#--------------------- |
| -0.5    | ----------------@@@@@@@@@@@@@@@@----##----*-----@@@@@@@@@@@@@@@----------------@@ |
| -0.62   | ----------------------**--------------#-**--------------#------------------------ |
| -0.75   | ------------------------**------------**#-------------##------------------------- |
| -0.87   | --------------------------**-------***---###-------###--------------------------- |
| -1      | ----------------------------*******---------#######------------------------------ |
|         | v:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::v |
|         | 0                                                                               8 |
|         |     # sin(x)                                                                      |
|         |     * cos(x)                                                                      |
|         |     @ sign(tan(x))/2                                                              |
+---------+-----------------------------------------------------------------------------------+
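The full multi-line query is what the post builds up to; its flavor can be hinted at with a far simpler, sideways, single-function variant (a sketch, not the post’s query):

-- One text bar per sample point: bar length tracks sin(x).
SELECT seq / 2 AS x,
       RPAD('', ROUND(10 * (1 + SIN(seq / 2))), '*') AS bar
FROM (SELECT 0 AS seq UNION ALL SELECT 1 UNION ALL SELECT 2
      UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5
      UNION ALL SELECT 6 UNION ALL SELECT 7) AS seq_values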

Continue reading » “SQL multi line chart”

On free and open blog posts: this post is free

I’d like to voice some opinions and criticism regarding free and open blog posts, including those appearing on Planet MySQL.

Some rants follow; skip if you’re only after technical stuff

1. On free reading

This post is completely free. You don’t need to pay or register to read it, nor will you be required to provide personal details in order to comment.

If you happen to see this post on some website which only provides you with the first couple of sentences, then asks you to log in to view the complete text, you should know that you need no registration to read this.

Continue reading » “On free and open blog posts: this post is free”

How to calculate a good InnoDB log file size – recap

Following Baron Schwartz’s post How to calculate a good InnoDB log file size, which shows how to estimate the InnoDB log file size, and based on SQL: querying for status difference over time, I’ve written a query for MySQL 5.1 which, by sampling 60 seconds of status, estimates the InnoDB transaction log bytes expected to be written in one hour.

Recap: this information can be useful if you’re looking for a good innodb_log_file_size value: one that will not pose too much I/O (smaller values make for more frequent flushes), nor make for too long a recovery time (larger values mean more transactions to recover upon crash).
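A hedged sketch of the approach (not necessarily the post’s exact query): sample Innodb_os_log_written twice, 60 seconds apart, and extrapolate to one hour:

SELECT VARIABLE_VALUE INTO @log_written_0
  FROM INFORMATION_SCHEMA.GLOBAL_STATUS
  WHERE VARIABLE_NAME = 'INNODB_OS_LOG_WRITTEN';
DO SLEEP(60);
SELECT VARIABLE_VALUE INTO @log_written_1
  FROM INFORMATION_SCHEMA.GLOBAL_STATUS
  WHERE VARIABLE_NAME = 'INNODB_OS_LOG_WRITTEN';
-- 60 one-minute samples per hour:
SELECT (@log_written_1 - @log_written_0) * 60 AS log_bytes_per_hour;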

It is assumed that the 60 seconds period represents an average system load, not some activity spike period. Edit the sleep time and factors as you will to sample longer or shorter periods.

Continue reading » “How to calculate a good InnoDB log file size – recap”

SQL: querying for status difference over time

The InnoDB plugin has a nice INFORMATION_SCHEMA concept: resetting tables. For example, the INNODB_CMP table lists information about compression operations. A similar table, INNODB_CMP_RESET, provides the same information, but resets the values when read. The latter can be used to measure, for example, the number of compression operations over time.

I wish to present a SQL trick which does the same, without the need for resetting tables. Suppose you have some status table, and you wish to measure the change in status per second, per minute, etc. The trick is to query for the value twice within the same query, with some pause in between, and compute the difference.

For the sake of simplicity, I’ll demonstrate using 5.1’s INFORMATION_SCHEMA.GLOBAL_STATUS. Please refer to INFORMATION_SCHEMA.GLOBAL_STATUS: watch out for some discussion on this.
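As a minimal sketch of the trick (not necessarily the post’s exact query; it relies on MySQL evaluating the select list left to right, which holds in practice but is not guaranteed):

-- Read Com_select, sleep 10 seconds within the same query, read it again,
-- then reduce the two samples to a per-second rate.
SELECT (MAX(com_select) - MIN(com_select)) / 10 AS com_select_per_second
FROM (
  SELECT SLEEP(sample_index * 10) AS pause,
         (SELECT 0 + VARIABLE_VALUE FROM INFORMATION_SCHEMA.GLOBAL_STATUS
           WHERE VARIABLE_NAME = 'COM_SELECT') AS com_select
  FROM (SELECT 0 AS sample_index UNION ALL SELECT 1) AS samples
) AS sampled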

Continue reading » “SQL: querying for status difference over time”