code.openark.org

I’ve seen some passwords that took a few years off my life.

I mean, we all know about dictionary words, right? And we’ve all seen Spaceballs, right? But choosing 12345 as your password is not the only careless option: there are many more! The more familiar I get with users’ passwords, the more I see how much alike they all are.

Let’s review some of the commonly used bad password practices:

Empty passwords. Need I say more? Apparently yes. So what if “there’s only access through firewall from our company’s IP”?
Dictionary passwords: real English words like ‘falcon’ or ‘tiger’. Don’t use these! These are the easiest to attack.
Well-known words: how about ‘Gandalf’? It’s not a dictionary word, but it’s popular enough to appear in any respectable word list. For that matter, look at how well passwords are filtered on RedHat: you can’t choose a password which is a common first or last name in the US, Italy, or even Israel; which is great!
Common substitutions: enough with ‘1nsi9ht’ and ‘@dm1n’! These are almost as easy to break as dictionary words; it’s just a matter of a few more combinations per word.
Keyboard clusters: say No! to ‘1qa2ws’. Don’t use ‘$rty&*io’. They seem random at first sight, but look for them on the keyboard: it’s just your common “how shall I create a password that’s so easy to remember I will never forget it?”. Now REPLACE(“remember”, “break”) and REPLACE(“never forget”, “always regret”).
Children’s names, birth dates, 123456, your car’s license plate number, your Yahoo! mail password, etc. etc. etc.
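The patterns in the list above can be checked mechanically. Here is a minimal Python sketch of such a check, not a real password auditor: the word list is a tiny hypothetical stand-in, the substitution table covers only a few look-alikes, the keyboard layout is assumed to be US QWERTY, and the 0.75 threshold is an arbitrary choice.

```python
# Hypothetical, tiny word list for illustration; a real check would load a large one.
COMMON_WORDS = {"falcon", "tiger", "gandalf", "insight", "admin", "passwd"}

# A few common look-alike substitutions, mapped back to the letters they stand for.
SUBSTITUTIONS = str.maketrans(
    {"1": "i", "9": "g", "@": "a", "5": "s", "0": "o", "3": "e"}
)

# Assumed US QWERTY layout; each key gets a (row, column) position.
KEYBOARD_ROWS = ["1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEY_POSITION = {
    key: (row_index, column_index)
    for row_index, row in enumerate(KEYBOARD_ROWS)
    for column_index, key in enumerate(row)
}
# Shifted digit keys are mapped back to the digit on the same key.
UNSHIFT = str.maketrans("!@#$%^&*()", "1234567890")


def keyboard_adjacent_fraction(password):
    """Fraction of consecutive characters typed on the same or neighbouring keys."""
    positions = [KEY_POSITION.get(ch) for ch in password.lower().translate(UNSHIFT)]
    pairs = list(zip(positions, positions[1:]))
    if not pairs:
        return 0.0
    adjacent = sum(
        1
        for a, b in pairs
        if a is not None
        and b is not None
        and abs(a[0] - b[0]) <= 1
        and abs(a[1] - b[1]) <= 1
    )
    return adjacent / len(pairs)


def weaknesses(password):
    """Return a list of the bad practices this password appears to follow."""
    if password == "":
        return ["empty"]
    found = []
    lowered = password.lower()
    if lowered in COMMON_WORDS:
        found.append("dictionary or well-known word")
    elif lowered.translate(SUBSTITUTIONS) in COMMON_WORDS:
        found.append("common substitution")
    # Arbitrary threshold: most keystrokes land next to the previous one.
    if keyboard_adjacent_fraction(password) >= 0.75:
        found.append("keyboard cluster")
    return found


for candidate in ["", "tiger", "@dm1n", "1qa2ws", "zxcvbn"]:
    print(repr(candidate), weaknesses(candidate))
```

An empty result only means that none of these few patterns matched; it does not mean the password is strong.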

There are many guidelines for choosing strong passwords. And everyone seems to know about them. But I’m still surprised when I find out the MySQL root password is ‘zxcvbn’ or ‘pa55wd’.

MySQL allows for any character in your password, so you may use punctuation, spaces, and other symbols. This is stronger than using only letters and digits.
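As an illustration of drawing on that larger character set, here is a minimal Python sketch that generates a random password from letters, digits, punctuation and the space character. The function name, the alphabet and the default length of 16 are assumptions made for this example.

```python
import secrets
import string

# Letters, digits, punctuation and the space character.
ALPHABET = string.ascii_letters + string.digits + string.punctuation + " "


def random_password(length=16):
    """Pick each character independently with a cryptographically strong generator."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(random_password())
```

Passwords containing quotes, backslashes or spaces must be quoted or escaped properly when they are typed into an SQL statement or a shell command.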

Continue reading » “Passwords which are bad for your health”

New and noteworthy in mycheckpoint (rev. 57)

MySQL | Graphs, Monitoring, mycheckpoint | December 16, 2009

Rev. 57 of mycheckpoint has been released and is available for download.

New and updated in this revision:

Remote host monitoring
Improved charting
Flexible charting
Fix to questions vs. queries issues

Remote host monitoring

It is now possible to monitor one host, while writing into another. Either or both could be remote hosts:

mycheckpoint --host=localhost --monitored-host=192.168.10.178

The above monitors the MySQL server on 192.168.10.178, and writes the monitored data to localhost (to be queried later).

mycheckpoint --monitored-host=127.0.0.1 --host=192.168.10.178

The above monitors the MySQL server on 127.0.0.1, and writes the monitored data to 192.168.10.178.

Continue reading » “New and noteworthy in mycheckpoint (rev. 57)”

In favour of a milestone based release model

MySQL | Opinions | December 15, 2009

I like milestone based release models.

The advantages of this model are particularly beneficial for MySQL. What I like about it:

Things are unstable for shorter periods. Even if some feature is not fully stable in some milestone, the model encourages fixing such a feature at higher priority.
It is easy to create a priority ranking for new features. Moreover, priorities are expressed more by chronological time of development, less by “how many people are working on it”.
The model pushes towards rapid development, since you can’t release M5 before M4 is complete.

The last versions of MySQL took a long time to complete. Take 5.1, for example: partitioning and event scheduling were long considered GA before row-based replication was half stable. Consider the small but useful sub-second slow logs; the variables made dynamic in 5.1 (slow log again, for example); the new INFORMATION_SCHEMA tables.

Continue reading » “In favour of a milestone based release model”

RPM builds for openark kit

MySQL | openark kit | December 10, 2009

Thanks to Lenz Grimmer, openark kit is now available in RPM format.

.DEB and Python packages are available, as usual, on the project page on Google Code.

Thank you, Lenz!

Useful temporal functions & queries

MySQL | Data Types, Indexing, SQL | December 8, 2009

Here’s a compilation of some common and useful time & date calculations and equations. Some, though very simple, are often misunderstood, leading to inefficient or incorrect implementations.

There are many ways to solve such problems. I’ll present my favorites.

Querying for time difference

Given two timestamps: ts1 (older) and ts2 (newer), how much time has passed between them?

One can use TIMEDIFF() & DATEDIFF(), or compare two UNIX_TIMESTAMP() values. My personal favorite is TIMESTAMPDIFF(), because I’m usually interested in a specific metric, like the number of hours which have passed, or the number of days, disregarding the smaller minute/second resolution. This allows one to write:

SELECT TIMESTAMPDIFF(HOUR, ts1, ts2)

Take, for example:

SELECT TIMESTAMPDIFF(MONTH, '2008-10-07 00:00:00', '2009-12-06 00:00:00')

The function correctly accounts for the number of days in each month, and returns 13, the truncated number of full months.
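To make the truncation concrete, here is a small Python sketch that reproduces the same "full units only" arithmetic for months and hours. It illustrates the idea and is not MySQL's implementation; the function names are made up for this example, and both functions assume the first argument is the older timestamp.

```python
from datetime import datetime


def full_months_between(older, newer):
    """Number of complete months between two datetimes (older <= newer)."""
    months = (newer.year - older.year) * 12 + (newer.month - older.month)
    # If the day and time of month of `older` has not been reached yet,
    # the last month is still incomplete.
    if (newer.day, newer.time()) < (older.day, older.time()):
        months -= 1
    return months


def full_hours_between(older, newer):
    """Number of complete hours between two datetimes (older <= newer)."""
    return int((newer - older).total_seconds() // 3600)


print(full_months_between(datetime(2008, 10, 7), datetime(2009, 12, 6)))
```

For the dates in the example above, the raw month difference is 14, and one is subtracted because December 6 comes before the 7th of the month.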

Doing arithmetic

One can use TIMESTAMPADD(), or DATE_SUB(), but, again, when dealing with specific resolutions, I find “+ INTERVAL” to be the most convenient:

SELECT ts1 + INTERVAL 10 HOUR

Continue reading » “Useful temporal functions & queries”

On restoring a single table from mysqldump

MySQL | Backup, Books, mysqldump, Performance, scripts | December 1, 2009 | December 16, 2009

Following Restore one table from an ALL database dump and Restore a Single Table From mysqldump, I would like to add my own thoughts and comments on the subject.

I also wish to note performance issues with the two suggested solutions, and offer improvements.

Problem relevance

While the problem is interesting, I just want to note that it is relevant only for databases of a very specific size. Too small – and it doesn’t matter how you solve it (e.g. just open vi/emacs and copy+paste). Too big – and it would not be worthwhile to restore from mysqldump anyway. I would suggest that the problem is interesting in the neighborhood of a few dozen GB worth of data.

Problem recap

Given a dump file (generated by mysqldump), how do you restore a single table, without making any changes to other tables?

Let’s review the two referenced solutions. I’ll be using the employees db on mysql-sandbox for testing. I’ll choose a very small table to restore: departments (only a few rows in this table).
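For orientation, here is a minimal Python sketch of the general idea of pulling one table's section out of a dump file. It is not either of the referenced solutions. It assumes the dump contains the default mysqldump comment lines of the form "-- Table structure for table `name`" (so it was not produced with comments disabled), and the function name is made up for this example.

```python
ANY_TABLE_MARKER = "-- Table structure for table `"


def extract_table(lines, table):
    """Yield only the dump lines that belong to the given table's section."""
    wanted_marker = ANY_TABLE_MARKER + table + "`"
    inside = False
    for line in lines:
        if line.startswith(ANY_TABLE_MARKER):
            if inside:
                # The next table's section begins: stop without reading the rest.
                return
            inside = line.strip() == wanted_marker
        if inside:
            yield line


# Usage (file names are placeholders):
# with open("dump.sql") as dump, open("departments.sql", "w") as out:
#     out.writelines(extract_table(dump, "departments"))
```

Because the generator returns as soon as the following table starts, the remainder of a large dump is never scanned. The session settings at the top of the dump are not copied, and views or multi-database dumps are not handled.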

Security based solution

Chris suggests creating a special-purpose account, which will only have write (CREATE, INSERT, etc.) privileges on the particular table to restore. Cool hack! But, I’m afraid, not too efficient, for two reasons:

Continue reading » “On restoring a single table from mysqldump”

questions or queries?

MySQL | Monitoring, mycheckpoint | November 13, 2009

I’ve hit a recent change which took me by surprise.

I was used to checking the ‘questions’ global status variable to see the total number of queries the server performs. So, for example, I could run com_select/questions to learn the SELECT ratio out of all queries.

Apparently, as of 5.0.72–5.0.76 & 5.1.31 this has changed. A new status variable was introduced, called ‘queries’.

The change being? questions no longer indicates the number of queries the server has executed: only the number of queries requested by the client (so calling a stored routine only counts as 1, regardless of how many queries the routine executes). The new queries variable indicates the number of queries the server has executed (see the 5.0 and 5.1 docs for details).

So, as of 5.0.72 or 5.1.31, the calculation should be com_select/queries (or com_select_diff/queries_diff) to learn the SELECT ratio of all queries. I learned this due to a bug report on mycheckpoint, which presented some 10265% SELECT ratio…
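A small Python sketch shows why the choice of denominator matters. The counter values are hypothetical, picked only to imitate a server where most statements run inside stored routines; the function name is made up for this example.

```python
def select_ratio_percent(com_select, total_statements):
    """SELECT statements as a percentage of all counted statements."""
    if total_statements == 0:
        return 0.0
    return 100.0 * com_select / total_statements


# Hypothetical status counters.
com_select = 5000  # SELECT statements executed
questions = 100    # statements sent by clients (a routine call counts once)
queries = 8000     # statements executed, including those inside routines

print(select_ratio_percent(com_select, questions))  # far above 100%
print(select_ratio_percent(com_select, queries))
```

Dividing by questions mixes a server-side count with a client-side count, which is how a ratio far above 100% can appear. Dividing by queries keeps both numbers on the same footing.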

My take on this is that it could have been worked out differently: instead of changing the meaning of an existing variable, questions could have remained as it was, with the introduction of, say, client_questions, which would only indicate the number of queries issued by clients.

I believe changing the meaning of status variables at such late versions (5.0.76 is quite late!) invites trouble: code that used to work on already then-stable versions (e.g. 5.0.51) would behave differently after upgrade. Such changes should best take place while still in BETA phase.

Performance analysis with mycheckpoint

MySQL | Analysis, InnoDB, Monitoring, mycheckpoint, Performance | November 12, 2009 | December 16, 2009

mycheckpoint (see announcement) allows for both graph presentation and quick SQL access to monitored & analyzed data. I’d like to show the power of combining them both.

InnoDB performance

Taking a look at one of the most important InnoDB metrics: the read hit ratio (we could get the same graph by looking at the HTML report):

SELECT innodb_read_hit_percent FROM sv_report_chart_sample \G
*************************** 1. row ***************************
innodb_read_hit_percent: http://chart.apis.google.com/chart?cht=lc&chs=400x200&chts=303030,12&chtt=Nov+10,+11:40++-++Nov+11,+08:55+(0+days,+21+hours)&chdl=innodb_read_hit_percent&chdlp=b&chco=ff8c00&chd=s:400664366P6674y7176677677u467773y64ux166666764366646y616666666666644444434444s6u4S331444404433341334433646777666666074736777r1777767764776666F667777617777777777777777yaRi776776mlf667676xgx776766rou67767777u37797777x76676776u6A737464y67467761777666643u66446&chxt=x,y&chxr=1,99.60,100.00&chxl=0:||Nov+10,+15:55|Nov+10,+20:10|Nov+11,+00:25|Nov+11,+04:40|&chxs=0,505050,10

We see that read hit is usually high, but occasionally drops low, down to 99.7, or even 99.6. But it seems like most of the time we are above 99.95% read hit ratio. It’s hard to tell about 99.98%.
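Part of the difficulty is the chart's resolution. The chd=s: parameter in the URL uses the Google Chart API "simple encoding", in which every character stands for one of 62 levels (A-Z, a-z, 0-9, with an underscore for a missing value). The Python sketch below decodes such a string, assuming the levels are spread linearly over the y-axis range given in chxr (99.60 to 100.00 here); that scaling is an assumption about how the chart was built, and the function name is made up for this example.

```python
SIMPLE_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
)


def decode_simple(data, axis_min, axis_max):
    """Map simple-encoded characters back to approximate axis values."""
    step = (axis_max - axis_min) / (len(SIMPLE_ALPHABET) - 1)
    values = []
    for ch in data:
        if ch == "_":
            values.append(None)  # underscore marks a missing value
        else:
            values.append(axis_min + SIMPLE_ALPHABET.index(ch) * step)
    return values


# First few samples of the chart data shown above.
print(decode_simple("400664366P", 99.60, 100.00))
```

With 61 steps across a range of 0.4, each step is roughly 0.0066 percentage points, so neighbouring values such as 99.98 and 99.987 cannot be told apart reliably on the chart.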

Can we know for sure?

We can strain our eyes, yet be certain of little. It’s best if we just query for the metrics! mycheckpoint provides all the data, accessible by simple SQL queries:

Continue reading » “Performance analysis with mycheckpoint”

Replication analysis with mycheckpoint

MySQL | Analysis, Monitoring, mycheckpoint, Replication | November 11, 2009 | November 12, 2009

I would like to show how mycheckpoint (see announcement) can be put to use for analyzing various replication metrics.

Lagging slaves

A slave has been monitored. Monitoring started at a time when it was way behind master (about two days lag), but it has since caught up. This can be easily verified by the following chart:

The above chart can be obtained by viewing the HTML report:

SELECT html FROM sv_report_html

Or by directly issuing the query:

mysql> SELECT seconds_behind_master FROM sv_report_chart_hour\G
*************************** 1. row ***************************
seconds_behind_master: http://chart.apis.google.com/chart?cht=lc&chs=400x200&chts=303030,12&chtt=Nov+5,+10:00++-++Nov+10,+08:00+(4+days,+22+hours)&chdl=seconds_behind_master&chdlp=b&chco=ff8c00&chd=s:976431zzzywutrpnliiifdbZYXVTRRRPNLJHEBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA&chxt=x,y&chxr=1,0,169811&chxl=0:||Nov+6,+09:00|Nov+7,+09:00|Nov+8,+08:00|Nov+9,+08:00|&chxs=0,505050,10

This is all nice. But I’m also interested in the rate at which slave lag decreased. Many ignore this important metric: just how fast does your slave replicate?
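As a rough illustration of what such a rate means, here is a minimal Python sketch that works from two samples of seconds_behind_master. The function names and the sample numbers are hypothetical, and the sketch assumes the lag value is a fair estimate of how far the slave is behind.

```python
def catch_up_rate(lag_start, lag_end, elapsed_seconds):
    """Seconds of lag recovered per second of wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (lag_start - lag_end) / elapsed_seconds


def relative_replication_speed(lag_start, lag_end, elapsed_seconds):
    """How many seconds of master activity the slave applies per second."""
    # While the clock advanced by `elapsed_seconds`, the master also produced
    # that much new work, so the slave applied the elapsed time plus the
    # amount of lag it recovered.
    return 1.0 + catch_up_rate(lag_start, lag_end, elapsed_seconds)


def seconds_until_caught_up(current_lag, rate):
    """Estimated time to reach zero lag, or None if the slave is not gaining."""
    if rate <= 0:
        return None
    return current_lag / rate


# Hypothetical samples taken one hour apart.
rate = catch_up_rate(lag_start=7200, lag_end=3600, elapsed_seconds=3600)
print(rate, relative_replication_speed(7200, 3600, 3600), seconds_until_caught_up(3600, rate))
```

A rate of zero means the slave merely keeps pace with the master, and a negative rate means it is falling further behind.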