Monotonic functions, SQL and MySQL

In mathematics, a monotonic function (or monotone function) is a function which preserves the given order. [Wikipedia]

To be more precise, a function f is monotonic increasing, if for every x ≤ y it holds that f(x) ≤ f(y). f is said to be strictly monotonic increasing is for every x < y it holds that f(x) < f(y).

So, if we follow values in some order, we say that f is monotonic increasing if f‘s value never decreases (it either increases or stays the same), and we say that f is strictly increasing if f‘s value is always changes “upwards”.

Monotonic functions play an important role in SQL. To discuss monotonic functions in SQL we must first determine what the order is, and then, what the function is.

Well, they both change according to our point of view. Let’s look at some examples. Take a look at the following table: Continue reading » “Monotonic functions, SQL and MySQL”

Beware of implicit casting

Ever so often a query provides a “bad” execution plan. Adding a missing index can many times solve the problem. However, not everything can be solved with an index. I wish to highlight the point of having an implicit cast, which negates the use of an index on MySQL.

I see this happening a lot on customers’ databases, and this begs for a short introduction.

MySQL doesn’t support index functions

Let’s assume the following table: Continue reading » “Beware of implicit casting”

Checking for string permutation

A permutation is a change of places. Thus, ‘lolhe’ is a permuted ‘hello’ (commonly referred to as ‘scrambled text’).

I wish to present an SQL solution for checking if two strings are permutations of the same text.

About permutations

So, if ‘lolhe’ is a permutation of ‘hello’, then ‘hello’ is a permutation of ‘lolhe’, as well; and both are permutations of ‘elloh’. The REVERSE() of a text is an example of permutation. Mathematically, string permutation is an equivalence relation, and divides all strings to equivalence classes.

Use cases

  • We may be interested in permutations when a user chooses a password. We may disallow a password which is identical to the login name; but we may also disallow upper-lower-case-only transformations of the text. We may still disallow a permutation of the text.
  • On a slightly different scale, the two queries: SELECT * FROM City WHERE id IN (5, 21, 13) and SELECT * FROM City WHERE id IN (13, 5, 21) are identical. Here, the permutation is not with string characters, but with string tokens. While the solution discussed is targeted at string characters, it can be easily converted to work with string tokens.

Checking for permutation

The solution I’m suggesting checks for permutation between 2 strings by permuting both to a third, normal form. The two string are permutations of each other if both have the same normal form.

Continue reading » “Checking for string permutation”

Misimproving performance problems with INSERT DELAYED

INSERT DELAYED may come in handy when using MyISAM tables. It may in particular be useful for log tables, where one is required to issue frequent INSERTs on one hand, but does not usually want or need to wait for DB response on the other hand.

It may even offer some performance boost, by aggregating such frequent INSERTs in a single thread.

But it is NOT a performance solution.

That is, in a case I’ve seen, database performance was poor. INSERTs were taking a very long time. Lot’s of locks were involved. The solution offered was to change all slow INSERTs to INSERT DELAYED. Voila! All INSERT queries now completed in no time.

But the database performance remained poor. Just as poor as before, with the additional headache: nobody knew what caused the low performance.

Using INSERT DELAYED to improve overall INSERT performance is like sweeping the dust under the carpet. It’s still there, only you can’t actually see it. When your queries are slow to return, you know which queries or which parts of your application are the immediate suspects. When everything happens in the background you lose that feeling.

The slow query log, fortunately, still provides with the necessary information, and all the other metrics are just as before. Good. But it now takes a deeper level of analysis to find a problem that was previously in plain sight.

So: use INSERT DELAYED carefully, don’t just throw it at your slow queries like a magic potion.

mycheckpoint rev. 76: OS monitoring, auto deploy, brief HTML and 24/7 reports

Revision 76 of mycheckpoint comes with quite a few improvements, including:

  • OS monitoring (CPU, load average, memory)
  • Auto-deploy
  • Improved charting
  • Brief HTML reports
  • 24/7 charts

OS Monitoring

When monitoring the local machine, mycheckpoint now monitors CPU utilization, load average, memory and swap space.

This only applies to the Linux operating system; there is currently no plan to work this out for other operating systems.

Examples:

mysql> SELECT os_cpu_utilization_percent FROM sv_report_chart_sample;

mycheckpoint-chart-cpu-sample
mysql> SELECT ts, os_loadavg FROM mycheckpoint.sv_report_sample;
+---------------------+------------+
| 2009-12-27 11:45:01 |       1.78 |
| 2009-12-27 11:50:01 |       2.48 |
| 2009-12-27 11:55:01 |       2.35 |
...
+---------------------+------------+
mysql> SELECT report FROM mycheckpoint.sv_report_human_sample ORDER BY id DESC LIMIT 1 \G
*************************** 1. row ***************************
report:
Report period: 2009-12-27 13:20:01 to 2009-12-27 13:25:01. Period is 5 minutes (0.08 hours)
Uptime: 100.0% (Up: 334 days, 06:37:28 hours)

OS:
 Load average: 1.67
 CPU utilization: 25.2%
 Memory: 7486.4MB used out of 7985.6484MB (Active: 6685.8906MB)
 Swap: 3835.2MB used out of 8189.3750MB
...

Auto-deploy

mycheckpoint now has a version recognition mechanism. There is no need to call mycheckpoint with the “deploy” argument on first install or after upgrade. mycheckpoint will recognize a change of version and will auto-deploy before moving on to monitoring your system.

Continue reading » “mycheckpoint rev. 76: OS monitoring, auto deploy, brief HTML and 24/7 reports”

Passwords which are bad for your health

I’ve seen some passwords to take a few years from my life.

I mean, we all know about dictionary words, right? And we’ve all seen Spaceballs, right? But choosing 12345 as your password is not the only careless option: there are many more! The more I get familiar with user’s password, the more I see how so much alike they all are.

Let’s review some of the commonly used bad password practices:

  • Empty passwords. Need I say more? Apparently yes. So what if “there’s only access through firewall from our company’s IP”?
  • Dictionary passwords: real English words like ‘falcon‘ or ‘tiger‘. Don’t use these! These are the easiest to attack.
  • Well known words: how about ‘Gandalf‘? It’s not dictionary, but it’s popular enough to appear in any respectable list. For that matter, look at how well filtered passwords are on RedHat: you can’t choose a password which is a common first or last name in the US, Italy, or even Israel; which is great!
  • Common substitues: enough with ‘1nsi9ht‘ and ‘@dm1n‘! These are almost as easy to break as dictionary words; it’s just a matter of a few more combinations per word.
  • Keyboard clustered: say No! to ‘1qa2ws‘. Don’t use ‘$rty&*io‘. They seems to be random at first sight, but look for them on the keyboard: it’s just your common “how shall I create a password that’s so easy to remember I will never forget it?”. Now REPLACE(“remember”, “break”) and REPLACE(“never forget”, “always regret”).
  • Children’s names, birth dates, 123456, your car’s license plate number, your Yahoo! mail password, etc. etc. etc.

There are many guidelines for choosing strong passwords. And everyone seems to know about it. But I’m still surprised when I find out the MySQL root password is ‘zxcvbn‘ or ‘pa55wd‘.

MySQL allows for any character in your password, so you may use punctuations, spaces, and other symbols. This is stronger than plain characters and digits.

Continue reading » “Passwords which are bad for your health”

New and noteworthy in mycheckpoint (rev. 57)

Rev. 57 of mycheckpoint has been released and is available for download.

New and updated in this revision:

Remote host monitoring

It is now possible to monitor one host, while writing into another. Either or both could be remote hosts:

mycheckpoint --host=localhost --monitored-host=192.168.10.178

The above monitors the MySQL server on 192.168.10.178, and writes down to localhost (to be queried later)

mycheckpoint --monitored-host=127.0.0.1 --host=192.168.10.178

The above monitors the MySQL server on 127.0.0.1, and writes down to 192.168.10.178.

Continue reading » “New and noteworthy in mycheckpoint (rev. 57)”

In favour of a milestone based release model

I like milestone based release models.

The advantages I find in this model are in particular beneficial for MySQL. What I find good about this model are:

  • Things are unstable for shorter periods. Even if some feature is not full stable in some milestone, the model encourages that such a feature is fixed on higher priority.
  • It is easy to create a priority ranking for new features. Moreover, priorities are expressed more by chronological time of development, less by “how many people are working on it”.
  • The model pushes towards rapid development, since you can’t release M5 before M4 is complete.

The last versions of MySQL took long time to complete. Take 5.1, for example: partitioning and event scheduling were long considered GA before row-based replication was half stable. Consider the so small but useful sub-second slow logs; the variables made dynamic in 5.1 (slow log again, for example); the new INFORMATION_SCHEMA tables.

Continue reading » “In favour of a milestone based release model”