Announcing common_schema: common views & routines for MySQL

Today I have released common_schema, a utility schema for MySQL which includes many views and functions, and is aimed to be installed on any MySQL server.

What does it do?

There are views providing all sorts of useful information: schema analysis, data dimensions, monitoring, processes & transactions, security, internals… There are basic functions answering common needs.

Some of the views/routines simply formalize those queries we tend to write over and over again. Others take the place of external tools, answering complex questions via SQL and metadata. Still others help out with SQL generation.

There's plenty to explore; take a look at the common_schema documentation for the full listing. And it's evolving: I've got quite a few ideas already for future components.
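
For a taste of the kind of queries this enables, here's a sketch; redundant_keys and processlist_per_userhost are view names taken from the project documentation, but do verify against the version you install:

    -- Which indexes are made redundant by other indexes?
    SELECT * FROM common_schema.redundant_keys;

    -- How many connections is each user@host accounting for?
    SELECT * FROM common_schema.processlist_per_userhost;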

Some of these views rely on heavyweight INFORMATION_SCHEMA tables; querying those may require the server to open and read metadata for many tables at once, so be aware of the impact and risks.

What do I need to install?

There's no script or executable file. It's just a schema. The distribution is an SQL file which generates common_schema, much like a dump file.
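
For example, a minimal installation sketch, assuming the 5.1 distribution file is named common_schema_mysql_51.sql (check the actual file name in your download):

    shell> mysql -u root -p < common_schema_mysql_51.sql
    shell> mysql -u root -p -e "SHOW DATABASES LIKE 'common_schema'"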

What are the system requirements?

It's just between you and your MySQL. There are currently three distribution files, each dedicated to a different range of MySQL versions (and allowing for increased functionality where supported):

  • common_schema_mysql_51: fits all MySQL >= 5.1 distributions
  • common_schema_innodb_plugin: fits MySQL >= 5.1, with InnoDB plugin + INFORMATION_SCHEMA tables enabled
  • common_schema_percona_server: fits Percona Server >= 5.1

Refer to the documentation for more details.

What are the terms of use?

common_schema is released under the BSD license.

Where can I download it?

On the common_schema project page. Enjoy it!

Reasons to use AUTO_INCREMENT columns on InnoDB

An InnoDB table must have a primary key (one is created if you don't do it yourself). You may have a natural key at hand. Stop! Allow me to suggest an AUTO_INCREMENT may be better.

Why should one add an AUTO_INCREMENT PRIMARY KEY to a table which already has a natural key? Isn't an AUTO_INCREMENT a pseudo key, meaning it has no explicit relation to the row data, other than being numeric and unique?

Yes, indeed so. Nevertheless, consider:

  • Natural keys are often multi-column.
  • Multi-column PRIMARY KEYs make for larger keys, and make for bloated secondary keys as well: in InnoDB, every secondary key entry carries a copy of the PRIMARY KEY columns. You may be wasting space storing the additional AUTO_INCREMENT column, but you may gain it back on the secondary keys.
  • Multi-column PRIMARY KEYs make for more locks. See also this post.
  • InnoDB INSERTs work considerably faster when performed in ascending PRIMARY KEY order. Can you ensure your natural key arrives in such an order?
  • Even though an AUTO_INCREMENT makes for an INSERT bottleneck (values must be handed out serially), it is particularly helpful to InnoDB in ensuring PRIMARY KEY values are in ascending order.
  • AUTO_INCREMENT makes for chronological resolution. You know what came first, and what came next.
  • In many datasets, more recent entries are accessed more often, and are therefore “hotter”. By using AUTO_INCREMENT, you're ensuring that recent entries are grouped together within the B+ Tree. This means less random I/O when looking for recent data.
  • A numeric key is particularly helpful in splitting your table (and tasks on your table) into smaller chunks. I write tools which can work with any PRIMARY KEY combination, but it's easier to work with numbers (see the sketch below).
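
Here's a minimal sketch of the above, with invented table and column names: the natural (order_id, product_id) key steps aside in favor of an AUTO_INCREMENT surrogate, and is enforced as a UNIQUE secondary key instead:

    CREATE TABLE order_lines (
      id INT UNSIGNED AUTO_INCREMENT,
      order_id INT UNSIGNED NOT NULL,
      product_id INT UNSIGNED NOT NULL,
      quantity INT NOT NULL,
      PRIMARY KEY (id),                  -- compact, ever-ascending clustered key
      UNIQUE KEY (order_id, product_id)  -- the natural key, kept as a secondary index
    ) ENGINE=InnoDB;
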
Thoughts and ideas for Online Schema Change

Here are a few thoughts on the current status and further possibilities of Facebook's Online Schema Change (OSC) tool. I've had these thoughts for months now, pondering over improving oak-online-alter-table, but haven't got around to implementing them, nor even writing them down. Better late than never.

The tool has some limitations. Some cannot be lifted; some could. Quoting from the announcement and looking at the code, I add a few comments, and conclude with a general opinion on the tool's abilities.

“The original table must have PK. Otherwise an error is returned.”

This restriction could be lifted: it's enough that the table has a UNIQUE KEY. My original oak-online-alter-table handled that particular case, and as far as I can see, the Facebook code would work just as well with any unique key.

However, this restriction is of no real interest. As we’re mostly interested in InnoDB tables, and since any InnoDB table should have a PRIMARY KEY, we shouldn’t care too much.

“No foreign keys should exist. Otherwise an error is returned.”

Tricky stuff. With oak-online-alter-table, changes to the original table were immediately reflected in the ghost table; with InnoDB tables, that meant within the same transaction. And although I never got to update the text and code, there shouldn't be a reason not to support child-side foreign keys (the child side is the table on which the FK constraint is defined).

The Facebook patch works differently: it captures changes and writes them to a delta table, to be later (asynchronously) analyzed and replayed on the ghost table.
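
To illustrate the mechanism, here's a sketch with invented names; the actual Facebook schema and trigger code differ:

    -- A delta table records which rows changed, and how:
    CREATE TABLE city_delta (
      id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
      dml_type ENUM('insert','update','delete') NOT NULL,
      city_id INT UNSIGNED NOT NULL
    ) ENGINE=InnoDB;

    -- Capture trigger: a DELETE on city is merely recorded here,
    -- to be replayed on the ghost table later, asynchronously:
    CREATE TRIGGER city_osc_ad AFTER DELETE ON city
    FOR EACH ROW
      INSERT INTO city_delta (dml_type, city_id) VALUES ('delete', OLD.id);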

So in the Facebook code, some cases will lead to undesired behavior. Consider two tables, country and city, with city holding a RESTRICT/NO ACTION foreign key on country‘s id. Now consider the scenario:

  1. Rows from city are DELETEd, where the country Id is Spain’s.
    • city‘s ghost table is still unaffected, Spain’s cities are still there.
    • A change is written to the delta table to mark these rows for deletion.
  2. A DELETE is issued on country‘s Spain record.
    • The DELETE should work, from the user's perspective.
    • But it will fail: city's ghost table has not received the changes yet. There are still matching rows, and the NO ACTION constraint will fail the DELETE statement.

Now, this does not lead to corruption, just to seemingly unreasonable behavior on the database's part. This behavior is probably undesired: a NO ACTION constraint won't do.

However, with CASCADE or SET NULL options, there is less of an issue: operations on the parent table (e.g. country) cannot fail. We must make sure operations on the ghost table make it consistent with the original table (e.g. city).

Consider the following scenario:

  1. A new country is created, called “Sleepyland”. An INSERT is made to country.
    • Both city and city‘s ghost are immediately aware of it.
  2. A new town is created and INSERTed to city. The town is called “Naphaven”.
    • The change takes time to propagate to city‘s ghost table.
  3. Meanwhile, we realized we made a mistake. We’ve been had. There’s no such city nor country.
    1. We DELETE “Naphaven” from city.
    2. We DELETE “Sleepyland” from country.
    • Note that city‘s ghost table still hasn’t caught up with the changes.
  4. Eventually, the INSERT statement for “Naphaven” reaches city‘s ghost table.
    • What should happen now? The INSERT cannot succeed: its parent country row is already gone, so the FOREIGN KEY constraint rejects it.
    • Will this fail the entire process?

Looking at the PHP code, I see that changes written to the delta table are blindly replayed on the ghost table.

Since the process is asynchronous, this should not be the case. We can solve the above by using INSERT IGNORE instead of INSERT: the statement will fail silently, without failing anything else. The row cannot be inserted, precisely because the original row no longer exists.

Unlike replication corruption, this does not lead to accumulating mistakes. The replay is static, somewhat like events in the binary log: changes are just written, regardless of existing data.

I have given this considerable thought, and I can't say I've covered all the possible scenarios. However, I believe that with proper use of INSERT IGNORE and REPLACE INTO (two statements I heavily relied on with oak-online-alter-table), correctness can be achieved.
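
A sketch of what an idempotent replay could look like, reusing the invented city names from the capture sketch above, and assuming the ghost table carries id, name and country_id columns:

    -- Replaying a captured INSERT: if the original row has meanwhile been
    -- deleted, IGNORE swallows the failure instead of aborting the replay:
    INSERT IGNORE INTO city_ghost (id, name, country_id)
      SELECT id, name, country_id FROM city WHERE id = 12345;

    -- Replaying a captured UPDATE: write the row's current state, whether or
    -- not an older version already made it into the ghost table:
    REPLACE INTO city_ghost (id, name, country_id)
      SELECT id, name, country_id FROM city WHERE id = 12345;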

There's the small pain of re-generating the foreign key definition on the “ghost” table (CREATE TABLE LIKE … does not copy FK definitions). And since foreign key names must be unique, a new name must be picked. Not pretty, but perfectly doable.
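
A sketch, with invented constraint and table names:

    -- CREATE TABLE ... LIKE does not copy FOREIGN KEY definitions, so the
    -- child-side constraint is re-added by hand, under a fresh, unique name:
    ALTER TABLE city_ghost
      ADD CONSTRAINT city_ghost_country_fk
      FOREIGN KEY (country_id) REFERENCES country (id)
      ON DELETE CASCADE;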

“No AFTER_{INSERT/UPDATE/DELETE} triggers must exist.”

It would be nicer if MySQL had an ALTER TRIGGER statement, but there is no such statement. If there were such an atomic statement, we would be able to rewrite the trigger so as to append our own code to the end of the trigger's body. Yuck. It would be even nicer if we were allowed multiple triggers for the same event.

So, we are left with DROP TRIGGER and CREATE TRIGGER. Alas, this makes for a short period during which the trigger does not exist. Bad. The easy solution would be to LOCK the table for WRITE, but apparently you can't DROP the trigger (*) while the table is locked. Sigh.

(*) This happened to me, and apparently to Facebook too. With the latest 5.1 version (5.1.51) this actually works; with 5.0 it didn't use to. This needs more checking.

Use of INFORMATION_SCHEMA

As with oak-online-alter-table, OSC checks for triggers, indexes and columns by querying the INFORMATION_SCHEMA tables. This makes for nice SQL for getting the exact listing and types of PRIMARY KEY columns, whether or not AFTER triggers exist, and so on.
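
For example, along these lines (a sketch; the exact queries used by the tools differ):

    -- The PRIMARY KEY columns of a given table, in index order:
    SELECT COLUMN_NAME
      FROM INFORMATION_SCHEMA.STATISTICS
      WHERE TABLE_SCHEMA = 'world' AND TABLE_NAME = 'city'
        AND INDEX_NAME = 'PRIMARY'
      ORDER BY SEQ_IN_INDEX;

    -- Do any AFTER triggers already exist on the table?
    SELECT TRIGGER_NAME, EVENT_MANIPULATION
      FROM INFORMATION_SCHEMA.TRIGGERS
      WHERE EVENT_OBJECT_SCHEMA = 'world' AND EVENT_OBJECT_TABLE = 'city'
        AND ACTION_TIMING = 'AFTER';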

I've always considered this reliance on INFORMATION_SCHEMA to be the weak part of openark-kit. It's easier, it's cleaner, it's even more correct to work that way, but it just takes too many locks. I think Baron Schwartz (and now Daniel Nichter) did amazing work in Maatkit, analyzing table schemata by parsing SHOW CREATE TABLE and other SHOW commands with regular expressions. It's crazy work! Had I written openark-kit in Perl, I would have just imported their code. But I'm too lazy busy to do the conversion from Perl to Python and rewrite that code, what with all the debugging.

OSC is written in PHP, so again, much conversion work. Performance-wise, I think this is an important step to take.

A word for the critics

Finally, a word for the critics. I’ve read some Facebook/MySQL bashing comments and wish to relate.

In his interview with The Register, Mark Callaghan gave the example that “Open Schema Change lets the company update indexes without user downtime, according to Callaghan”.

PostgreSQL was mentioned for being able to add an index with only read locks taken, or to do the work with no locks at all using CREATE INDEX CONCURRENTLY. I wish MySQL had that feature! Yes, MySQL has a lot to improve upon, and the latest PostgreSQL 9.0 brings valuable new features. (Did I make it clear I have no intention of bashing PostgreSQL? If not, please re-read this paragraph until convinced.)

Some bashing related to the notion that MySQL is so poor that Facebook had to use an even poorer mechanism to work out the ALTER TABLE.

Well, allow me to add a few words: CREATE INDEX is far from the only thing you can achieve with OSC (although it may be Facebook's major concern). You should be able to:

  • Add columns
  • Drop columns
  • Convert character sets
  • Modify column types
  • Add partitioning
  • Reorganize partitioning
  • Compress the table
  • Otherwise change the table format
  • Heck, you could even modify the storage engine! (to another transactional engine)

These are giant steps. How easy would it be to build all this into the database server itself? Here, it only took a few weeks' time to work out a working solution with reasonable limitations, using just the resources the MySQL server provides. The MySQL@Facebook team should be given credit for that.

mk-schema-change? Check out ideas from oak-online-alter-table

In response to Mark Callaghan's post, mk-schema-change.

I apologize for not commenting on the post itself; I do not hold a Facebook account. Anyway, this is a long write-up, so it may as well deserve a post of its own.

Some of the work Mark is describing already exists under openark kit‘s oak-online-alter-table. Allow me to explain what I have gained there, and how the issue can be further pursued. There is relevance to Mark’s suggestion.

oak-online-alter-table uses a combination of locks, chunks and triggers to achieve an almost non-blocking ALTER TABLE effect. I had a very short opportunity to speak with Mark at last year's conference, in between bites. Mark stated that anything involving triggers was irrelevant in his case.

The triggers are a pain, but I believe a few other insights from oak-online-alter-table can be of interest.

The first attempt

My first attempt with the script assumed:

  • Table has an AUTO_INCREMENT PRIMARY KEY column
  • New rows always gain ascending PRIMARY KEY values
  • PRIMARY KEY never changes for an existing row
  • PRIMARY KEY values are never reused
  • Rows may be deleted at will
  • No triggers exist on the table
  • No FOREIGN KEYs exist on the table.

So the idea was: when one wants to do an ALTER TABLE:

  1. Create a ghost table with the new structure.
  2. Read the minimum and maximum PK values.
  3. Create AFTER INSERT, AFTER UPDATE, AFTER DELETE triggers on the original table. These triggers will propagate the changes onto the ghost table.
  4. Working slowly, and in small chunks, copy rows within the recorded min-max value range into the ghost table. The interesting part is where the script makes sure there's no contradiction between these actions and those of the triggers (whichever came first!). This is largely solved using INSERT IGNORE and REPLACE INTO in the proper context.
  5. Working slowly and in chunks again, remove rows from the ghost table which no longer exist in the original table.
  6. Once all chunking is complete, RENAME the original table to *_old, and the ghost table in place of the original.

Steps 4 & 5 are similar in concept to transactional recovery through redo logs and undo logs.
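
As an illustration of step 3, here's a sketch with invented names; the real tool generates the triggers from the table's definition, and UPDATEs which modify the PRIMARY KEY need extra care:

    -- Propagate ongoing changes from the original table t onto its ghost:
    CREATE TRIGGER t_osc_ai AFTER INSERT ON t
    FOR EACH ROW
      REPLACE INTO t_ghost (id, data) VALUES (NEW.id, NEW.data);

    CREATE TRIGGER t_osc_au AFTER UPDATE ON t
    FOR EACH ROW
      REPLACE INTO t_ghost (id, data) VALUES (NEW.id, NEW.data);

    CREATE TRIGGER t_osc_ad AFTER DELETE ON t
    FOR EACH ROW
      DELETE FROM t_ghost WHERE id = OLD.id;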

The next attempt

The next phase removed the AUTO_INCREMENT requirement, as well as the “no reuse of PK” restriction. In fact, the only remaining constraints were:

  • There is some UNIQUE KEY on the table which is unaffected by the ALTER operation
  • No triggers exist on the table
  • No FOREIGN KEYs exist on the table.

The steps are in general very similar to those listed previously, only now a more elaborate chunking method is used, with a possibly non-integer, possibly multi-column chunking algorithm (sketched below). Also, the triggers take care of changes to the UNIQUE KEY values themselves.
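
The multi-column chunking can be sketched with row-constructor comparisons (invented names; the real tool builds the column lists and boundary predicates dynamically):

    -- Copy the next chunk, resuming right after the last processed (a, b)
    -- pair; the tool records the highest (a, b) copied, for the next round:
    INSERT IGNORE INTO t_ghost (a, b, data)
      SELECT a, b, data
        FROM t
        WHERE (a, b) > (@last_a, @last_b)
        ORDER BY a, b
        LIMIT 1000;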

mk-schema-change?

Have a look at the wiki pages for OnlineAlterTable*. There is some discussion of concurrency issues and of transactional behavior, which explains why oak-online-alter-table performs correctly. Some of it is very relevant, I believe, to Mark's suggestion; in particular, making the chunked copy and retaining transactional integrity.

To remove any doubt: oak-online-alter-table is not production ready, or anywhere near it. Use at your own risk. I've seen it work, and I've seen it crash. I got little feedback, and thus little chance to fix things. I also haven't touched the code for quite a few months now, so I'm a little rusty myself.

Common wrong Data Types compilation

During my work with companies using MySQL, I have encountered many issues regarding schema design, normalization and indexing. Among the most common errors are incorrect data type definitions. Many times the database is designed by programmers or otherwise non-expert DBAs. Some companies do not have the time, and cannot spare the effort, to redesign and refactor their databases, and eventually face poor performance.

Here’s a compilation of “the right and the wrong” data types.

  • INT(1) is not one byte long. INT(10) is no bigger than INT(2). The number in parentheses is misleading: it is merely a display width (mostly meaningful with ZEROFILL), and does not affect storage or range. All the mentioned types are the same INT, with the same storage size and the same range. If you want a one-byte integer, use TINYINT.
  • An integer PRIMARY KEY is preferable, especially if you’re using the InnoDB storage engine. If possible, avoid using VARCHAR as PRIMARY KEY. In InnoDB, this will make the clustered index deeper, secondary indexes larger (sometimes much larger) and look ups slower.
  • Do not use VARCHAR to represent timestamps. '2008-11-14 07:59:13' may look like a textual value, but a TIMESTAMP is in fact just an integer counting the seconds elapsed since 1970-01-01. That's 4 bytes vs. 19 if you're using CHAR with the ASCII charset, or more if you're using UTF8 or VARCHAR.
  • Do not use VARCHAR to represent IPv4 addresses. This one is quite common. The IP 192.168.100.255 can be represented with VARCHAR(15), true, but is better represented as a 4-byte integer; that's what an IPv4 address is: four bytes. Use the INET_ATON() and INET_NTOA() functions to translate between the INT and textual values (see the sketch after this list).
  • This one should be obvious, but I've seen it in reality, where the schema was auto-generated by some naive generator: do not represent numbers as text. Yes, I have seen integer columns represented by VARCHAR. Don't ask how the performance was.
  • MD5() columns shouldn't be VARCHAR. Use CHAR(32) instead: the value is always 32 characters long, so there's no need for VARCHAR's additional length-byte overhead. If your tables or database are UTF8 by default, make sure the MD5 column's charset is ASCII, or it will consume 96 bytes instead of just 32. I also suggest the case-sensitive ascii_bin collation, but that's a more minor issue.
  • PASSWORD() columns shouldn't be VARCHAR, but CHAR. The length depends on whether you're using the old_passwords variable (for some strange reason, this variable always appears in the MySQL sample configuration files, though you really don't want it unless you need backward compatibility with older MySQL versions). As in the MD5 note, use the ASCII charset.
  • Better to use TIMESTAMP than INT for counting seconds, as MySQL has many supporting functions for this data type.
  • Use TINYINT, SMALLINT or MEDIUMINT instead of INT when possible. Do you expect to have 4,000,000,000 customers? No? Then an id SMALLINT (up to 65,535 values unsigned) may suffice as PRIMARY KEY.
  • Use CHARACTER SETs with care. More on this in future posts.
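
To tie several of these points together, here's a hedged sketch with invented table and column names:

    CREATE TABLE access_log (
      id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,  -- integer PK, not VARCHAR
      user_id MEDIUMINT UNSIGNED NOT NULL,         -- sized to the expected range
      ip INT UNSIGNED NOT NULL,                    -- IPv4 in 4 bytes, not VARCHAR(15)
      token_md5 CHAR(32) CHARACTER SET ascii COLLATE ascii_bin NOT NULL,
      created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP  -- 4 bytes, not 19
    ) ENGINE=InnoDB;

    -- Translating between the INT and textual IPv4 representations:
    INSERT INTO access_log (user_id, ip, token_md5)
      VALUES (12, INET_ATON('192.168.100.255'), MD5('some-token'));

    SELECT user_id, INET_NTOA(ip), created_at FROM access_log;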