Impact of foreign keys absence on replicating slaves

In this post I describe what happens when a slave’s Foreign Key setup is different from that of the master. I’m particularly interested in a setup where the slave has a subset of the master’s foreign keys, or no foreign keys at all. I wish to observe whether integrity holds.

Making the changes

Which foreign keys do we have, and how do we drop them? If you want to do this by hand, well, good luck! Fortunately, common_schema provides quite a few handy views and routines to assist us. Consider viewing the existing foreign keys on sakila:

master> SELECT create_statement FROM common_schema.sql_foreign_keys WHERE TABLE_SCHEMA='sakila';
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| create_statement                                                                                                                                                                                |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ALTER TABLE `sakila`.`address` ADD CONSTRAINT `fk_address_city` FOREIGN KEY (`city_id`) REFERENCES `sakila`.`city` (`city_id`) ON DELETE RESTRICT ON UPDATE CASCADE                             |
| ALTER TABLE `sakila`.`city` ADD CONSTRAINT `fk_city_country` FOREIGN KEY (`country_id`) REFERENCES `sakila`.`country` (`country_id`) ON DELETE RESTRICT ON UPDATE CASCADE                       |
| ALTER TABLE `sakila`.`customer` ADD CONSTRAINT `fk_customer_address` FOREIGN KEY (`address_id`) REFERENCES `sakila`.`address` (`address_id`) ON DELETE RESTRICT ON UPDATE CASCADE               |
| ALTER TABLE `sakila`.`customer` ADD CONSTRAINT `fk_customer_store` FOREIGN KEY (`store_id`) REFERENCES `sakila`.`store` (`store_id`) ON DELETE RESTRICT ON UPDATE CASCADE                       |
| ALTER TABLE `sakila`.`film` ADD CONSTRAINT `fk_film_language` FOREIGN KEY (`language_id`) REFERENCES `sakila`.`language` (`language_id`) ON DELETE RESTRICT ON UPDATE CASCADE                   |
| ALTER TABLE `sakila`.`film` ADD CONSTRAINT `fk_film_language_original` FOREIGN KEY (`original_language_id`) REFERENCES `sakila`.`language` (`language_id`) ON DELETE RESTRICT ON UPDATE CASCADE |
| ALTER TABLE `sakila`.`film_actor` ADD CONSTRAINT `fk_film_actor_actor` FOREIGN KEY (`actor_id`) REFERENCES `sakila`.`actor` (`actor_id`) ON DELETE RESTRICT ON UPDATE CASCADE                   |
| ALTER TABLE `sakila`.`film_actor` ADD CONSTRAINT `fk_film_actor_film` FOREIGN KEY (`film_id`) REFERENCES `sakila`.`film` (`film_id`) ON DELETE RESTRICT ON UPDATE CASCADE                       |
| ALTER TABLE `sakila`.`film_category` ADD CONSTRAINT `fk_film_category_category` FOREIGN KEY (`category_id`) REFERENCES `sakila`.`category` (`category_id`) ON DELETE RESTRICT ON UPDATE CASCADE |
| ALTER TABLE `sakila`.`film_category` ADD CONSTRAINT `fk_film_category_film` FOREIGN KEY (`film_id`) REFERENCES `sakila`.`film` (`film_id`) ON DELETE RESTRICT ON UPDATE CASCADE                 |
| ALTER TABLE `sakila`.`inventory` ADD CONSTRAINT `fk_inventory_film` FOREIGN KEY (`film_id`) REFERENCES `sakila`.`film` (`film_id`) ON DELETE RESTRICT ON UPDATE CASCADE                         |
| ALTER TABLE `sakila`.`inventory` ADD CONSTRAINT `fk_inventory_store` FOREIGN KEY (`store_id`) REFERENCES `sakila`.`store` (`store_id`) ON DELETE RESTRICT ON UPDATE CASCADE                     |
| ALTER TABLE `sakila`.`payment` ADD CONSTRAINT `fk_payment_customer` FOREIGN KEY (`customer_id`) REFERENCES `sakila`.`customer` (`customer_id`) ON DELETE RESTRICT ON UPDATE CASCADE             |
| ALTER TABLE `sakila`.`payment` ADD CONSTRAINT `fk_payment_rental` FOREIGN KEY (`rental_id`) REFERENCES `sakila`.`rental` (`rental_id`) ON DELETE SET NULL ON UPDATE CASCADE                     |
| ALTER TABLE `sakila`.`payment` ADD CONSTRAINT `fk_payment_staff` FOREIGN KEY (`staff_id`) REFERENCES `sakila`.`staff` (`staff_id`) ON DELETE RESTRICT ON UPDATE CASCADE                         |
| ALTER TABLE `sakila`.`rental` ADD CONSTRAINT `fk_rental_customer` FOREIGN KEY (`customer_id`) REFERENCES `sakila`.`customer` (`customer_id`) ON DELETE RESTRICT ON UPDATE CASCADE               |
| ALTER TABLE `sakila`.`rental` ADD CONSTRAINT `fk_rental_inventory` FOREIGN KEY (`inventory_id`) REFERENCES `sakila`.`inventory` (`inventory_id`) ON DELETE RESTRICT ON UPDATE CASCADE           |
| ALTER TABLE `sakila`.`rental` ADD CONSTRAINT `fk_rental_staff` FOREIGN KEY (`staff_id`) REFERENCES `sakila`.`staff` (`staff_id`) ON DELETE RESTRICT ON UPDATE CASCADE                           |
| ALTER TABLE `sakila`.`staff` ADD CONSTRAINT `fk_staff_address` FOREIGN KEY (`address_id`) REFERENCES `sakila`.`address` (`address_id`) ON DELETE RESTRICT ON UPDATE CASCADE                     |
| ALTER TABLE `sakila`.`staff` ADD CONSTRAINT `fk_staff_store` FOREIGN KEY (`store_id`) REFERENCES `sakila`.`store` (`store_id`) ON DELETE RESTRICT ON UPDATE CASCADE                             |
| ALTER TABLE `sakila`.`store` ADD CONSTRAINT `fk_store_address` FOREIGN KEY (`address_id`) REFERENCES `sakila`.`address` (`address_id`) ON DELETE RESTRICT ON UPDATE CASCADE                     |
| ALTER TABLE `sakila`.`store` ADD CONSTRAINT `fk_store_staff` FOREIGN KEY (`manager_staff_id`) REFERENCES `sakila`.`staff` (`staff_id`) ON DELETE RESTRICT ON UPDATE CASCADE                     |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Most of the foreign key constraints use RESTRICT for DELETE (meaning you are not allowed to delete a parent row while child rows exist) and CASCADE for UPDATE (meaning changes to the parent propagate to its children). This is good, since I want to test the behavior of both RESTRICT and CASCADE.
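
On the slave, the complementary step is to drop these keys. The same view can drive that as well; a minimal sketch, assuming the view’s companion drop_statement column and common_schema’s eval() routine, which executes the statements returned by a query:

slave> CALL common_schema.eval("SELECT drop_statement FROM sql_foreign_keys WHERE TABLE_SCHEMA='sakila'");

This leaves the slave with no foreign keys on sakila, which is exactly the setup I wish to test.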

Continue reading » “Impact of foreign keys absence on replicating slaves”

Webinar review: Zero-Downtime Schema Changes In MySQL

Yesterday I attended the Zero-Downtime Schema Changes In MySQL webinar by Baron Schwartz of Percona (do you say “attended” for something you listened to from your home office?).

I was keen to learn about possible enhancements and improvements of pt-online-schema-change over oak-online-alter-table. Here are my impressions:

The base logic of pt-online-schema-change is essentially the same as that of oak-online-alter-table: you create a ghost/shadow table, create complex triggers, copy rows in chunks, then freeze and swap. Both work on any type of PRIMARY KEY (oak-online-alter-table can work with any UNIQUE KEY; I’m not sure whether pt-online-schema-change can), be it an INTEGER, another type, or a multi-column one.
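
To make the shared base logic concrete, here is a minimal sketch of the ghost-table approach. These are not either tool’s actual statements; the table, its columns and the chunk range are illustrative, and real implementations handle primary-key updates, collisions and locking much more carefully:

CREATE TABLE tbl (id INT PRIMARY KEY, val VARCHAR(32));

-- The ghost table carries the desired schema change:
CREATE TABLE tbl_ghost LIKE tbl;
ALTER TABLE tbl_ghost ADD COLUMN note VARCHAR(64);

-- Triggers propagate ongoing changes from tbl onto the ghost:
CREATE TRIGGER tbl_ai AFTER INSERT ON tbl FOR EACH ROW
  REPLACE INTO tbl_ghost (id, val) VALUES (NEW.id, NEW.val);
CREATE TRIGGER tbl_au AFTER UPDATE ON tbl FOR EACH ROW
  REPLACE INTO tbl_ghost (id, val) VALUES (NEW.id, NEW.val);
CREATE TRIGGER tbl_ad AFTER DELETE ON tbl FOR EACH ROW
  DELETE FROM tbl_ghost WHERE id = OLD.id;

-- Copy existing rows in PRIMARY KEY ranges, chunk by chunk:
INSERT IGNORE INTO tbl_ghost (id, val)
  SELECT id, val FROM tbl WHERE id BETWEEN 1 AND 1000;
-- ...repeat for subsequent ranges...

-- Finally, swap atomically:
RENAME TABLE tbl TO tbl_old, tbl_ghost TO tbl;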

However, pt-online-schema-change also adds the following:

  • It supports FOREIGN KEYs (to some extent). This is something I’ve wanted to do with oak-online-alter-table but never got around to. Foreign keys are very tricky, as Baron noted. With child-side keys, things are reasonably manageable. With parent-side keys, this becomes a nightmare, sometimes an unsolvable one (when I say “unsolvable”, I mean unsolvable under the constraint of having the operation run in a non-blocking, transparent way).
  • Chunk size is auto-calculated by the script. This is a cool addition. Instead of letting the user throw out numbers like 1,000 rows per chunk, in the hope that this is neither too small nor too large, the tool monitors the time it takes a chunk to complete, then adjusts the size of the next chunk accordingly (see the sketch after this list). Hopefully this leads to a more optimized run, where locks are only held for very short periods, yet enough rows are processed at a time.
  • The tool looks into replicating slaves to verify they’re up to the job. If the slave lags too far, the tool slows down the work. This is an excellent feature, and again, one that I always wanted to have. Great work!
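
This is not pt-online-schema-change’s actual implementation (it is a Perl tool), but the last two ideas reduce to a simple feedback loop. A shell sketch, where rows_remain and copy_next_chunk are hypothetical helpers and all numbers are illustrative:

target=0.5                   # desired seconds per chunk copy
chunk=1000                   # initial chunk size, in rows
while rows_remain; do        # hypothetical helper: succeeds while rows are left to copy
  t0=$(date +%s.%N)
  copy_next_chunk "$chunk"   # hypothetical helper: copies the next $chunk rows to the ghost table
  t1=$(date +%s.%N)
  # Resize so that the next chunk takes roughly $target seconds:
  chunk=$(echo "scale=0; $chunk * $target / ($t1 - $t0)" | bc)
  # Throttle while a replicating slave lags beyond a threshold (NULL coerced to 0):
  while [ "$(mysql -h replica.example.com -e 'SHOW SLAVE STATUS\G' | awk '/Seconds_Behind_Master/ {print $2+0}')" -gt 10 ]; do
    sleep 1
  done
done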

So the three bullets above are what I understand to be the major advantages of Percona’s tool over oak-online-alter-table.

Q & A

The presentation itself was very good, and Baron answered some questions. There was one question he did not answer during the webinar, nor here, and I thought I might pop in and answer it. Although I can’t speak for the coders of pt-online-schema-change, I can safely assume that since the logic follows that of oak-online-alter-table, the same answer applies in the case of Percona’s toolkit. Continue reading » “Webinar review: Zero-Downtime Schema Changes In MySQL”

Installing MySQL from source/binary tarball as a Linux service

I’ve written before that I prefer a manual install of MySQL over a repository-based one. I still do: I typically install from a binary tarball or by compiling from source.

I’d like to share my procedure for installing MySQL on Linux and setting it up as a service. I’ve done this dozens of times, on different Linux flavors, and it works well for me.

Installing from source

To get this straight: you sometimes have to compile from source. I, for example, happen to use the Sphinx MySQLSE extension, which can only be used if compiled into MySQL. Likewise, you had to compile a “vanilla” 5.1 version without the query cache in order to completely remove the cache’s mutex contention.

Anyway, I find the easiest way is to install onto a path associated with the server version. For example, I would install a 5.5 server onto /usr/local/mysql55.

This way, each new version gets its own path, and there is no ambiguity.

To do that, use the --prefix option at the configuration step:

cd /path/to/extracted/source/tarball
sh BUILD/autorun.sh
./configure --prefix=/usr/local/mysql55
make
sudo make install

Once this is complete, you have everything under /usr/local/mysql55. This means binaries, libraries, scripts, etc.

To install the MySQL server as a service, copy the mysql.server script to /etc/init.d:

sudo cp /usr/local/mysql55/support-files/mysql.server /etc/init.d/mysql55

Again, I’m naming the script after the MySQL version. This avoids conflicts with possible past or future installations of the MySQL server, which typically create a service named mysql or mysqld.
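
To have the server start on boot, the new service would typically also be registered with the init system. A sketch, using the standard service tools of each distribution family:

# RHEL/CentOS family:
sudo chkconfig --add mysql55
sudo chkconfig mysql55 on

# Debian/Ubuntu family:
sudo update-rc.d mysql55 defaults

Continue reading » “Installing MySQL from source/binary tarball as a Linux service”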

It’s *that time* of the year

Even without attending the Percona Live conference in Santa Clara, you could tell something big was going on.

One way of measuring it was by looking at the flow of announcements. Here’s a brief list, and apologies if I’ve missed anyone:

All within the first days of the conference.

What this means, over the surface

I read a post by someone ranting about Oracle making so many announcements just as the conference began. He obviously suspected it was no coincidence. I got the impression he was looking at it the wrong way: as if Oracle’s announcements were meant to diminish the relevance of the conference.

I beg the opposite.

Continue reading » “It’s *that time* of the year”

common_schema talk at Percona Live

Are you attending PerconaLive?

Allow me to suggest you attend the Common Schema: a framework for MySQL server administration session on April 12, 14:00 – 14:50 @ Ballroom F.

This talk is by none other than Roland Bouman. Roland co-authored parts of common_schema, and is a great speaker.

I have a personal interest, of course, being the author of most of the components in common_schema. I would like to convert you to a supporter of this project. I know a few very smart people who think this project is an important tool. I would like more people to get to know it. Eventually, I would like developers and DBAs alike to consider it an inseparable part of any MySQL installation.

Then I shall have world domination, Bwa ha ha!

PS,

Have fun! I will unfortunately not attend myself this year. Having been on the program committee, I can tell it’s going to be a great conference!

Thoughts on using MySQL plugins

I’m sharing some thoughts on the viability of MySQL plugins. This is due to a particular experience I’ve had, which is thankfully resolved. However, it left a bitter taste in my mouth.

MySQL plugins are a tricky business. To create a plugin, you must compile it against the MySQL version you wish your users to run it with. Theoretically, you should compile it against every existing MySQL version, minor versions included (I’m not sure how often a build works across minor versions).

But, most importantly, you must adapt your plugin to major versions.

Another option for plugin makers is to not precompile at all, but rather ship the source code and let the end user compile it against her own MySQL version. But here, too, the code must be compatible with whatever changes the new MySQL version introduces.
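
Either way, the end result is a shared library which the server must agree to load. As a sketch (the plugin and library names are as I recall them for SphinxSE; verify against your own build):

mysql> INSTALL PLUGIN sphinx SONAME 'ha_sphinx.so';
mysql> SHOW PLUGINS;

A library compiled against a mismatched server version typically fails right at the INSTALL PLUGIN step, or else refuses to initialize.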

And if it doesn’t compile with the new MySQL version?

That’s what happened to me. The particular case at hand was SphinxSE, a plugin which serves as a bridge between MySQL and a Sphinx Search server. I’ve been using it for years and was happy with it. But, as it happened, it took well over a year for Sphinx to compile with MySQL 5.5. This meant I was unable to upgrade my 5.1 installation to 5.5, a thing I had been aiming to do for quite a while. Continue reading » “Thoughts on using MySQL plugins”

sphinx, sphinx_snippets() & MySQL 5.5

I’ve written a patch which completes Sphinx’s integration with MySQL 5.5.

Up until a couple of months ago, Sphinx would not compile with MySQL 5.5 at all. This is, thankfully, resolved as of Sphinx 2.0.3.

However, to my dismay, I found out that it only partially works: the sphinx_snippets() user defined function is not included within the plugin library. After some quick poking I discovered that it was not added to the build, and, when added, would not compile.

I rely on sphinx_snippets() quite a lot, and like it. Eventually I wrote a fix to snippets_udf.cc which allows it to run in a MySQL 5.5 server.

Here are the changes for the 2.0.4 version of Sphinx:

Replace your 2.0.4 files with these two, and get on with compiling your MySQL server.

Compilation guide

For completeness, here’s how to compile Percona Server 5.5 with Sphinx 2.0.4, including the above patches:
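
The full guide is behind the cut; for orientation, a 5.5-era build with SphinxSE compiled in looked roughly like the following sketch. All paths are illustrative, and it assumes Sphinx’s mysqlse sources are dropped into the server source tree, as Sphinx’s documentation prescribed:

cp -R /path/to/sphinx-2.0.4/mysqlse /path/to/percona-server-5.5/storage/sphinx
cd /path/to/percona-server-5.5
cmake . -DCMAKE_INSTALL_PREFIX=/usr/local/mysql55
make
sudo make install

Continue reading » “sphinx, sphinx_snippets() & MySQL 5.5”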

Auto caching INFORMATION_SCHEMA tables: seeking input

The short version

I have it all working. It’s kind of magic. But there are issues, and I’m not sure it should even exist, and am looking for input.

The long version

In Auto caching tables I presented a hack which allows getting cached or fresh results via simple SELECT queries.

The drive behind the above hack was INFORMATION_SCHEMA tables. There are two major problems with INFORMATION_SCHEMA:

  1. Queries on schema-oriented tables such as TABLES, COLUMNS, STATISTICS, etc. are heavyweight. How heavyweight? Enough to lock down your database. Enough, in some cases, to crash it.
  2. The data is always generated on the fly, as you request it. Query the COLUMNS table twice, and you risk two lockdowns of your database.

The auto-cache mechanism solves issue #2. I have it working, time-based: there is an auto-cache table for each of the heavyweight INFORMATION_SCHEMA tables, and, say, every 30 minutes the cache is invalidated. Throughout those 30 minutes, you get a free pass!

The auto-cache mechanism also paves the road to solving issue #1: since it works by invoking a stored routine, I have better control over the way I read INFORMATION_SCHEMA. Thus, I can take advantage of INFORMATION_SCHEMA optimizations. It’s tedious, but not complicated.

For example, if I want to cache the TABLES table, I don’t necessarily read the entire TABLES data in one pass. Instead, I can iterate over the schemata, get a list of table names per schema, then read the full row data for each of those, table by table. The result? Many, many more SELECTs, but each one optimized, and no single big lock-it-all query.
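
As a sketch of that idea (schema and table names illustrative), instead of one heavyweight scan, the caching routine can issue a series of narrow queries, each constrained by constants which the INFORMATION_SCHEMA implementation is able to optimize on:

-- One big, potentially lock-it-all read:
--   SELECT * FROM INFORMATION_SCHEMA.TABLES;
-- The iterative alternative:
SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA;
-- ...then, per schema:
SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'sakila';
-- ...then, per table:
SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'sakila' AND TABLE_NAME = 'film';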

And the problem is…

Continue reading » “Auto caching INFORMATION_SCHEMA tables: seeking input”

Auto caching tables

Is there a way to create a caching table, some sort of a materialized view, such that upon selecting from that table, its data is validated/invalidated?

Hint: yes.

But to elaborate on the point: say I have some table data_table. Can I rewrite all my queries which access data_table to read from some autocache_data_table, with nothing else changed in the query itself? No caveats, no additional WHEREs, and still have autocache_data_table provide the correct data, dynamically updated by some rule of our choice?

And: no crontab, no event scheduler, and no funny triggers on data_table? In such a way that invalidation/revalidation occurs upon SELECT?

Well, yes.

This post is long, but I suggest you read it through to understand the mechanism; it will be worthwhile.
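
To make the goal concrete, here is the intended usage, with all table and column names illustrative; nothing changes in the query but the table name:

-- Today, hitting the real table:
SELECT * FROM data_table WHERE some_column = 3;

-- With the mechanism in place: same query shape, no extra WHEREs;
-- invalidation/revalidation happens implicitly upon SELECT:
SELECT * FROM autocache_data_table WHERE some_column = 3;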

Background

The following derives from my long research on how to provide better, faster and safer access to INFORMATION_SCHEMA tables. It is, however, not limited to that exact scenario, and in this post I provide a simple, general-purpose example. I’ll have more to share about INFORMATION_SCHEMA-specific solutions shortly.

I was looking for a server-side solution which would not require query changes, apart from directing the query at other tables. The solution has to be supported by all standard MySQL installs; so: no plugins, no special rebuilds. Continue reading » “Auto caching tables”

MySQL monitoring: storing, not caching

I’ve followed with interest Baron’s Why don’t our new Nagios plugins use caching? and Sheeri’s Caching for Monitoring: Timing is Everything. I wish to present my take on this, from mycheckpoint’s point of view.

So mycheckpoint works in a completely different way. On one hand, it doesn’t bother with caching. On the other hand, it doesn’t bother with re-reads of data.

There are no staleness issues, the data is as consistent as it can get (you can never get a completely atomic read of everything in MySQL), and you can issue as many calculations as you want at the price of a single monitoring sample. As in Sheeri’s example, you can compute Threads_connected/max_connections*100, mix status variables, system variables, meta-variables (e.g. Seconds_behind_master), user-created variables (e.g. the number of purchases in your online shop), etc.

mycheckpoint’s concept is to store data. And store it in relational format. That is, INSERT it into a table.

A sampling run generates a single row, which lists all status, server, OS, user and meta variables. It’s a huge row, with hundreds of columns: columns like threads_connected, max_connections, innodb_buffer_pool_size, seconds_behind_master, etc.

mycheckpoint hardly cares about these columns. It identifies them dynamically. Have you just upgraded to MySQL 5.5? Oh, there’s a new bunch of server and status variables? No problem, mycheckpoint will notice it doesn’t have the matching columns and will add them via ALTER TABLE. There you go, now we have a place to store them.
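
In other words, upon noticing a variable with no matching column, mycheckpoint would issue something along these lines (a sketch; the variable name is hypothetical and the column type illustrative):

ALTER TABLE status_variables ADD COLUMN some_new_status_variable BIGINT;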

Running a formula like Threads_connected/max_connections*100 is as easy as issuing the following query:

SELECT Threads_connected/max_connections*100 FROM status_variables WHERE id = ...

Hmmm. This means I can run this formula on the most recent row I’ve just added. But wait, this also means I can run this formula on any row I’ve ever gathered.
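
For example (the alias is illustrative), charting the last hundred samples is just another query against stored rows:

SELECT id, Threads_connected/max_connections*100 AS connections_used_pct
FROM status_variables
ORDER BY id DESC
LIMIT 100;

Continue reading » “MySQL monitoring: storing, not caching”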