Introducing audit_login: simple MySQL login logfile based auditing

audit_login is a simple MySQL login auditing plugin, logging any login or login attempt to a log file in JSON format.
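
Installation goes through the standard MySQL plugin mechanism. A minimal sketch, assuming the compiled library is deployed as audit_login.so under the server's plugin_dir (the library and plugin names here are assumptions; check the plugin's README):

-- assumption: the shared library is named audit_login.so and resides in plugin_dir
INSTALL PLUGIN audit_login SONAME 'audit_login.so';
-- sanity check: confirm the plugin is listed and ACTIVE
SELECT PLUGIN_NAME, PLUGIN_STATUS FROM information_schema.PLUGINS WHERE PLUGIN_NAME = 'audit_login';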

It seems that audit plugins are all the rage lately… We developed our simple plugin a month ago as part of our database-securing efforts: by auditing any login or login attempt, we could either intercept suspicious logins or investigate them later.

However, we quickly realized there is much more to be gathered from this information.

In short: you install this plugin on your MySQL server, and the server starts writing entries like the following into a text file called audit_login.log:

{"ts":"2013-09-11 09:11:47","type":"successful_login","myhost":"gromit03","thread":"74153868","user":"web_user","priv_user":"web_user","host":"web-87.localdomain","ip":"10.0.0.87"}
{"ts":"2013-09-11 09:11:55","type":"failed_login","myhost":"gromit03","thread":"74153869","user":"backup_user","priv_user":"","host":"web-32","ip":"10.0.0.32"}
{"ts":"2013-09-11 09:11:57","type":"failed_login","myhost":"gromit03","thread":"74153870","user":"backup_user","priv_user":"","host":"web-32","ip":"10.0.0.32"}
{"ts":"2013-09-11 09:12:48","type":"successful_login","myhost":"gromit03","thread":"74153871","user":"root","priv_user":"root","host":"localhost","ip":"10.0.0.111"}
{"ts":"2013-09-11 09:13:26","type":"successful_login","myhost":"gromit03","thread":"74153872","user":"web_user","priv_user":"web_user","host":"web-11.localdomain","ip":"10.0.0.11"}
{"ts":"2013-09-11 09:13:44","type":"successful_login","myhost":"gromit03","thread":"74153873","user":"web_user","priv_user":"web_user","host":"web-40.localdomain","ip":"10.0.0.40"}
{"ts":"2013-09-11 09:13:51","type":"successful_login","myhost":"gromit03","thread":"74153874","user":"web_user","priv_user":"web_user","host":"web-03.localdomain","ip":"10.0.0.03"}
{"ts":"2013-09-11 09:14:09","type":"successful_login","myhost":"gromit03","thread":"74153875","user":"web_user","priv_user":"web_user","host":"web-40.localdomain","ip":"10.0.0.40"}
{"ts":"2013-09-11 10:55:25","type":"successful_login","myhost":"gromit03","thread":"74153876","user":"web_user","priv_user":"web_user","host":"web-87.localdomain","ip":"10.0.0.87"}
{"ts":"2013-09-11 10:55:59","type":"successful_login","myhost":"gromit03","thread":"74153877","user":"web_user","priv_user":"web_user","host":"web-12.localdomain","ip":"10.0.0.12"}
{"ts":"2013-09-11 10:55:59","type":"failed_login","myhost":"gromit03","thread":"74153878","user":"(null)","priv_user":"(null)","host":"(null)","ip":"10.0.0.1"}

In the above, your MySQL server is gromit03, and it is accepting connections from other hosts; some succeed, some fail. What kind of information can you gather from this log?

  • You can tell how many connections are being created on your server
  • Where they come from
  • Where ‘root’ connections come from
  • Port scans (see the last row) can be identified by their lack of credentials. These don’t have to be port scans per se; any telnet localhost 3306 followed by Ctrl+D will produce the same entry. Typically these are load balancer or monitoring checks verifying that port 3306 is active.
  • You can tell which accounts connect, and how many times
  • And you can infer which accounts are stale and can be dropped: if an account does not connect within a week’s time, it’s probably stale (pick your own timeframe; see the query sketch right after this list)
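
As a sketch of that last bullet: once the log entries are aggregated into a table (say, via logstash or a small import script), stale accounts fall out of a simple query. The audit_login table below is hypothetical, mirroring the JSON fields:

-- hypothetical table: audit_login(ts DATETIME, type VARCHAR(32), user VARCHAR(64), host VARCHAR(64), ip VARCHAR(15))
SELECT user, COUNT(*) AS logins, MAX(ts) AS last_login
FROM audit_login
WHERE type = 'successful_login'
GROUP BY user
HAVING last_login < NOW() - INTERVAL 7 DAY;  -- pick your own timeframe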

The above is quite interesting for a single host, but we have dozens. We’ve installed this plugin on all our MySQL servers, and we use logstash to aggregate the logs. We aggregate to two destinations: Continue reading » “Introducing audit_login: simple MySQL login logfile based auditing”

Converting an OLAP database to TokuDB, part 2: the process of migration

This is the second in a series of posts describing our experience in migrating a large DWH server to TokuDB. This post discusses the migration process itself.

As a quick recap (read part 1 here), we have a DWH server holding 2TB of compressed InnoDB data (4TB uncompressed). Space is running low, and we’re looking to TokuDB for answers. Early experiments show that TokuDB’s compression could significantly reduce disk space usage. I’m still not discussing performance; I’m keeping that for a later post.

Those with weak hearts can skip right to the end, where we finally have a complete conversion. You can also peek at the very end to find out how much 4TB of uncompressed InnoDB data is worth in TokuDB. But you might want to read through: the process was not smooth, nor as expected (it’s a war-story kind of thing). Throughout the migration we gained a lot of insight into TokuDB’s behaviour, limitations, conveniences, inconveniences and more.

Disclosure: I have no personal interests and no company interests; throughout the process we were in touch with Tokutek engineers, getting free, friendly & professional advice and providing input of our own. Most of this content has already been presented to Tokutek throughout the process. TokuDB is open source and free to use, though a commercial license is also available.

How do you convert 4TB worth of data to TokuDB?

Obviously one table at a time. But we had another restriction: you may recall I took a live slave for the migration process. And we wanted to end the process with a live slave. So the restriction was: keep it replicating!
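
Per table, the mechanics are a plain ALTER, issued directly on the migrating slave. A hedged sketch with hypothetical schema and table names:

-- one table at a time, directly on the slave
ALTER TABLE dwh.fact_events ENGINE=TokuDB;
-- depending on TokuDB version, compression can be chosen via row format, e.g.:
ALTER TABLE dwh.fact_clicks ENGINE=TokuDB ROW_FORMAT=TOKUDB_SMALL;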

How easy would that be? Based on our initial tests, I extrapolated over 20 days of conversion from InnoDB to TokuDB. Even with one table at a time, our largest table was expected to convert in some 12-14 days. Can we retain 14 days of binary logs on a server already running low on disk space? If only I knew then what I know today 🙂
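
The binary log side, at least, is easy to inspect and control; a quick sketch of the knobs involved:

SHOW BINARY LOGS;                  -- lists each binlog file with its size: what would 14 days cost in disk space?
SET GLOBAL expire_logs_days = 14;  -- dynamic; logs older than 14 days are purged

Continue reading » “Converting an OLAP database to TokuDB, part 2: the process of migration”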

Three wishes for a new year

Another new year by the Jewish calendar. What do I wish for in the coming year?

  1. World peace
  2. Good health to all
  3. Get auto-vacuuming, disk-space-reclaiming InnoDB tablespaces

No one likes rebuilding huge InnoDB tables. Rebuilds cost time, effort, system resources and sleep. I recently rebuilt a 300GB table, only to find it reduced to a mere 45GB. How about some background automation?
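
For now the reclaim is entirely manual; the usual incantation is a null ALTER, sketched here with a hypothetical table name:

-- forces a full rebuild; with innodb_file_per_table enabled,
-- the freshly written .ibd file returns the freed space to the filesystem
ALTER TABLE huge_table ENGINE=InnoDB;

Online variants such as pt-online-schema-change reduce the locking, but that is exactly the kind of manual effort wish #3 asks to automate.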

My wishes from previous years ([2010], [2011], [2012]) have not come true. I’m still willing to settle for two out of three.

Converting an OLAP database to TokuDB, part 1

This is the first in a series of posts describing my impressions of converting a large OLAP server to TokuDB. There’s a lot to tell, and the experiment is not yet complete, so this will be an ongoing series. In this post I will describe the case at hand and our initial reasons for looking at TokuDB.

Disclosure: I have no personal interests and no company interests; we did get friendly, useful and free advice from Tokutek engineers. TokuDB is open source and free to use, though a commercial license is also available.

The case at hand

We have a large and fast-growing DWH MySQL setup. This data warehouse is but one component in a larger data setup, which includes Hadoop, Cassandra and more. For online dashboards and most reports, MySQL is the serving database. We populate this warehouse mainly via Hive/Hadoop: we have an hourly load of data from Hive, as well as a larger daily load.

There are some updates on the data, but the majority of writes are just mysqlimports of Hive query results.
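
mysqlimport is essentially a command-line wrapper around LOAD DATA INFILE, so an hourly load boils down to something like the following (file and table names are hypothetical):

-- mysqlimport derives the table name from the file name; this is roughly:
--   mysqlimport --local dwh /data/hive_exports/fact_events.tsv
LOAD DATA LOCAL INFILE '/data/hive_exports/fact_events.tsv'
INTO TABLE fact_events
FIELDS TERMINATED BY '\t';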

Usage of this database is OLAP: no concurrency issues here; we have some should-be-fast-running queries issued by our dashboards, as well as ok-to-run-longer queries issued for reports.

Our initial and most burning trouble is size. Today we use COMPRESSED InnoDB tables (KEY_BLOCK_SIZE is the default, i.e. 8). Our data volume currently sums to about 2TB. I happen to know this translates to 4TB of uncompressed data.
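
For the record, that layout corresponds to DDL along these lines (hypothetical table; ROW_FORMAT=COMPRESSED requires innodb_file_per_table and the Barracuda file format):

CREATE TABLE fact_sample (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  payload VARCHAR(255)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED;  -- KEY_BLOCK_SIZE defaults to 8, i.e. half the 16KB page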

However, growth of data is accelerating. A year ago we would capture a dozen GB per month. Today it is 100GB per month, and by the end of this year it may climb to 150GB per month or more.

Our data is not sharded. We have a simple replication topology of some 6 servers. Machines are quite generous, as detailed later on. And yet, we will be running out of resources shortly: disk space (2.7TB total) is running low; with about 2TB already used and growth at 100-150GB per month, the remaining ~700GB is expected to run out in about six months. One of my first tasks in Outbrain is to find a solution to our DWH growth problem. The solution could be sharding; it could be a commercial DWH product; anything that works. Continue reading » “Converting an OLAP database to TokuDB, part 1”