Slides for “openark-kit: MySQL utilities for everyday use”

Today I have delivered my talk, openark-kit: MySQL utilities for everyday use, at the O’REILLY  MySQL Conference 2011.

The slides are uploaded to the O’Reilly site, and I’m attaching them here as well. Feel free to download the PDF: openark-kit-mysqlconf11.pdf

I wish to thank all who attended my talk!

 

Checking for AUTO_INCREMENT capacity with single query

Darn! This means oak-show-limits becomes redundant. Am I not supposed to speak about it on my coming presentation? Bad timing!

You have AUTO_INCREMENT columns. How far are you pushing the limits? Are you going to run out of AUTO_INCREMENT values soon? Perhaps you wonder whether you should ALTER from INT to BIGINT?

The answer is all there in INFORMATION_SCHEMA. The TABLES table shows the current AUTO_INCREMENT value per table, and the COLUMNS table tells us all about a column’s data type.

It takes some ugly code to deduce the maximum value per column type, what with signed/unsigned and data type, but then its very simple. Here is the query: Continue reading » “Checking for AUTO_INCREMENT capacity with single query”

Recovering a MySQL `root` password: the fourth solution

Have just read Darren Cassar’s Recovering a MySQL `root` password – Three solutions. There’s a fourth solution: using an init-file, which leads to just one restart of the database instead of two. It also avoids the security issue involved with using skip-grant-tables.

I’ve written all about it before on Dangers of skip-grant-tables.

Darren’s 1st advice (look for password ini files, scripts, etc.) is a very good one. One password that can always be looked up in files is the replication’s password.

Replication’s password is easily forgotten: you only set it once and never use it again; never script it nor manually login with. When setting up new slaves, though, you suddenly need it.

Apparently not many realize that the replication password is written in plaintext in the master.info file. This file tells the slave all about it’s master connection: host, port, user & password are all there for you to read.

Reasons to use AUTO_INCREMENT columns on InnoDB

An InnoDB table must have a primary key (one is created if you don’t do it yourself). You may have a natural key at hand. Stop! Allow me to suggest an AUTO_INCREMENT may be better.

Why should one add an AUTO_INCREMENT PRIMARY KEY on a table on which there’s a natural key? Isn’t an AUTO_INCREMENT a pseudo key, meaning, it doesn’t have any explicit relation to the row data, other than it is a number and unique?

Yes, indeed so. Nevertheless, consider:

  • Natural keys are many times multi-columned.
  • Multi column PRIMARY KEYs make for larger keys, and make for bloated secondary keys as well. You may be wasting space for storing the additional AUTO_INCREMENT column, but you may gain space back on secondary keys.
  • Multi column PRIMARY KEYs make for more locks. See also this post.
  • InnoDB INSERTs work considerably faster when worked in ascending PRIMARY KEY order. Can you ensure your natural key is in such order?
  • Even though an AUTO_INCREMENT makes for an INSERT bottleneck (values must be given serially), it is in particular helpful to InnoDB by ensuring PRIMARY KEY values are in ascending order.
  • AUTO_INCEMENT makes for chronological resolution. You know what came first, and what came next.
  • In many datasets, more recent entries are often being accessed more, and are therefore “hotter”. By using AUTO_INCREMENT, you’re ensuring that recent entries are grouped together within the B+ Tree. This means less random I/O when looking for recent data.
  • A numerical key is in particular helpful in splitting your table (and tasks on your table) into smaller chunks. I write tools which can work out with any PRIMARY KEY combination, but it’s easier to work with numbers.

Would you be my friend on mysqlconf? (tempting offer inside)

I’m still throwing papers to the trash and starting all over, fixing, rewriting and improving my talk at mysqlconf 2011, where I will be presenting openark-kit: MySQL utilities for everyday use.

However I’ve got something up my sleeve: a benefit many can enjoy, that’ll make me a respectful, popular and sought after speaker. While others may try and lure you with such earthly temptations as a 20% off discount, I am in a position to offer you a more spiritual gift: my friendship!

See, if you become my friend, I can offer you a 25% discount on the MySQL conference. Yes, that’s 5% more than my competitors! The only thing I ask in return is that you be my friend (hey, it’s called “friends of speaker”). Not like a FB virtual friend, but a real friendship! One where you can buy me beer or dinner!

If you agree to such humane terms, I will be in the position to let you know that all you have to do is fill in mys11fsd in your registration form.

No, wait! I let it slip! Rewrite: You should fill in mys11fsd [will only tell you this password after your commitment to a beer] in your registration form.

Oh no, not again!

Don’t use mys11fsd without talking to me first… You’re not supposed to… Oh, my beer!

Argghhh!

Limiting table disk quota in MySQL

Question asked by a student: is there a way to limit a table’s quote on disk? Say, limit a table to 2GB, after which it will refuse to grow? Note that the requirement is that rows are never DELETEd. The table must simply refuse to be updated once it reaches a certain size.

There is no built-in way to limit a table’s quota on disk. First thing to observe is that MySQL has nothing to do with this. It is entirely up to the storage engine to provide with such functionality. The storage engine is the one to handle data storage: how table and keys are stored on disk. Just consider the difference between MyISAM’s .MYD & .MYI to InnoDB’s shared tablespace ibdata1 to InnoDB’s file-per table .ibd files.

The only engine I know of that has a quota is the MEMORY engine: it accepts the max_heap_table_size, which limits the size of a single table in memory. Hrmmm… In memory…

Why limit?

I’m not as yet aware of the specific requirements of said company, but this is not the first time I heard this question.

The fact is: when MySQL runs out of disk space, it goes with a BOOM. It crashed ungracefully, with binary logs being out of sync, replication being out of sync. To date, and I’ve seen some cases, InnoDB merely crashes and manages to recover once disk space is salvaged, but I am not certain this is guaranteed to be the case. Anyone?

And, with MyISAM…, who knows?

Rule #1 of MySQL disk usage: don’t run out of disk space.

Workarounds

I can think of two workarounds, none of which is pretty. The first involves triggers (actually, a few variations for this one), the second involves privileges. Continue reading » “Limiting table disk quota in MySQL”

Generating Google line charts with SQL, part II

This post continues Generating Google line charts with SQL, part I, in pursue of generating time series based image charts.

We ended last post with the following chart:


http://chart.apis.google.com/chart?cht=lc&chs=400x200&chtt=SQL%20chart&chxt=x,y&chxr=1,-4716.6,5340.0&chd=s:dddddddddeeeeeefffffffffeeeedddcccbbaaZZZYYYXXXXXXXXXYYYZZabbcdeefghhijkkllmmmmmmmmllkkjihgfedcbZYXWVUTSRRQQPPPPQQQRSTUVWXZacdfgijlmnpqrssttuuuttssrqonmkigfdbZXVTSQONMLKJIIIIIIJKLMOPRTVXZbegilnprtvwyz01111110zyxvtrpnkifcaXUSPNLJHFECBBAAABBCEFHJLNQTWZcfilortwy1346789999876420yvspmjfcYVSOL

which has a nice curve, and a proper y-legend, but incorrect x-legend and no ticks nor grids.

To date, Google Image Charts do not support time-series charts. We can’t just throw timestamp values and expect the chart to properly position them. We need to work these by hand.

This is not easily done; if our input consists of evenly spread timestamp values, we are in a reasonable position. If not, what do we do?

There are several solutions to this:

  • We can present whatever points we have on the chart, making sure to position them properly. This makes for an uneven distribution of ticks on the x-axis, and is not pleasant to watch.
  • We can extrapolate values for round hours (or otherwise round timestamp resolutions), and so show evenly spread timestamps. I don’t like this solution one bit, since we’re essentially inventing values here. Extrapolation is nice when you know you have nice curves, but not when you’re doing database monitoring, for example. You must have the precise values.
  • We can do oversampling, then group together several measurements within round timestamp resolutions. For example, we can make a measurement every 2 minutes, yet present only 6 measurements per hour, each averaging up 10 round minutes. This is the approach I take with mycheckpoint.

The latest approach goes even beyond that: what if we missed 30 minutes of sampling? Say the server was down. We then need to “invent” the missing timestamps. Note that we invent the timestamps, we do not invent values. We must present the chart with missing values on our invented timestamps.

I may show how to do this in a future post. Meanwhile, let’s simplify and assume our values are evenly spread. Continue reading » “Generating Google line charts with SQL, part II”

Upgrading passwords from old_passwords to “new passwords”

You have old_passwords=1 in your my.cnf. I’m guessing this is because you used one of the my-small.cnf, my-large.cnf etc. templates provided with your MySQL distribution.

These files can easily win the “most outdated sample configuration file contest”.

Usually it’s no big deal: if some parameter isn’t right, you just go and change it. Some variables, though, have a long-lasting effect, and are not easily reversed.

What’s the deal with old_passwords?

No one should be using these anymore. This variable makes the password hashing algorithm compatible with that of MySQL 4.0. I’m pretty sure 4.0 was released 9 years ago. I don’t know of anyone still using it (or 4.0 client libraries).

The deal is this: with old_passwords you get a 16 hexadecimal digits (64 bit) hashing of your passwords. With so called “new passwords” you get 40 hexadecimal digits (plus extra “*“). So this is about better encryption of your password. Read more on the manual.

How do I upgrade to new password format?

You can’t just put a comment on the “old_passwords=1” entry in the configuration file. If you do so, the next client to connect will attempt to match a 41 characters hashed password to your existing 16 characters entry in the mysql.users table. So you need to make a simultaneous change: both remove the old_passwords entry and set a new password. You must know all accounts’ passwords before you begin.

Continue reading » “Upgrading passwords from old_passwords to “new passwords””

Upgrading to Barracuda & getting rid of huge ibdata1 file

Some of this is old stuff, but more people are now converting to InnoDB plugin, so as to enjoy table compression, performance boosts. Same holds for people converting to Percona’s XtraDB. InnoDB plugin requires innodb_file_per_table. No more shared tablespace file.

So your ibdata1 file is some 150GB, and it won’t reduce. Really, it won’t reduce. You set innodb_file_per_table=1, do ALTER TABLE t ENGINE=InnoDB (optionally ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8), and you get all your tables in file-per-table .ibd files.

But the original ibdata1 file is still there. It has to be there, don’t delete it! It contains more than your old data.

InnoDB tablespace files never reduce in size, it’s an old-time annoyance. The only way to go round it, if you need the space, is to completely drop them and start afresh. That’s one of the things so nice about file-per-table: an ALTER TABLE actually creates a new tablespace file and drops the original one.

The procedure

The procedure is somewhat painful:

On generating unique IDs using LAST_INSERT_ID() and other tools

There’s a trick for using LAST_INSERT_ID() to generate sequences in MySQL. Quoting from the Manual:

  1. Create a table to hold the sequence counter and initialize it:
    mysql> CREATE TABLE sequence (id INT NOT NULL);
    mysql> INSERT INTO sequence VALUES (0);
    
  2. Use the table to generate sequence numbers like this:
    mysql> UPDATE sequence SET id=LAST_INSERT_ID(id+1);
    mysql> SELECT LAST_INSERT_ID();
    

This trick calls for trouble.

Contention

A customer was using this trick to generate unique session IDs for his JBoss sessions. These IDs would eventually be written back to the database in the form of log events. Business go well, and one day the customer adds three new JBoss servers (doubling the amount of webapps). All of a sudden, nothing works quite as it used to. All kinds of queries take long seconds to complete; load average becomes very high. Continue reading » “On generating unique IDs using LAST_INSERT_ID() and other tools”