Pseudo GTID

Pseudo GTID is a method to implement a GTID-like solution where slaves are easily connected to one another. This blog post and the following ones will describe work in progress (some 80% completed), where simulation of GTID makes for a good enough basis for refactoring replication topologies. I’m coding this in orchestrator, which already provides a substantial infrastructure support for this.

The final goal: orchestrator will allow you to move a slave below another, using only the data available by those two slaves. The usage is obvious:

  • Easy master failover (master dead? Orchestrator will choose the most advanced slave to promote and make it master of its siblings)
  • Slave promotion in complex topologies (with deep nested topologies, be able to move a slave up the hierarchy even if its local master is corrupted).

This can all happen with your normal, non GTID, MySQL replication, using your normal binary log files & positions.

This work in progress is inspired by Sam Lambert at GitHub, who has worked on a similar solution with different implementation. I also recall discussions with other DBAs having similar solution.

Pseudo GTID

First thing’s first, the basis for proposed solution is a pseudo-GTID. A unique entry in the binary logs (not necessarily sequential; not necessarily in ascending order). While in GTID implementations we have a unique identifier for each entry in the binary log, with pseudo-GTID we accept an occasional (or frequent) unique entry in the binary log.

There are many ways to do so. Certainly a client can generate a unique Id and invoke some statement on MySQL involving that ID. That would serve as valid grounds for the proposed solution. But I like things to be contained within MySQL. Consider, for example, the following event, which would be my preferred choice in Statement Based Replication (for RBR solution, see next post):

drop table if exists test.pseudo_gtid;
create table if not exists test.pseudo_gtid (
  id int unsigned not null primary key,
  ts timestamp,
  gtid varchar(64) charset ascii
);


drop event if exists test.update_pseudo_gtid_event;

delimiter ;;
create event if not exists
  test.update_pseudo_gtid_event
  on schedule every 10 second starts current_timestamp
  on completion preserve
  enable
  do
    begin
      set @pseudo_gtid := uuid();
      insert into test.pseudo_gtid (id, ts, gtid) values (1, NOW(), @pseudo_gtid) on duplicate key update ts=NOW(), gtid=VALUES(gtid);
    end
;;

delimiter ;

The above is based on Making UUID() and RAND() replication safe. What do we get? Once in 10 seconds (or what have you), a unique entry is written to the binary log.

Continue reading » “Pseudo GTID”

Making UUID() and RAND() replication safe

MySQL’s UUID() and RAND() functions both provide with (pseudo) indeterministic result. UUID()‘s result is moreover bound to the host on which it executes. For this reason, both are unsafe to replicate with STATEMENT binlog format. As an example, consider:

master> create table test.uuid_test (id int, u varchar(64));

master> insert into test.uuid_test values (1, UUID());
Query OK, 1 row affected, 1 warning (0.03 sec)

master> select * from test.uuid_test;
+------+--------------------------------------+
| id   | u                                    |
+------+--------------------------------------+
|    1 | 7e3596d8-56ac-11e4-b284-3c970ea31ea8 |
+------+--------------------------------------+

The warning we got on the insert directly relates to the following inconsistency on a slave:

slave1> select * from test.uuid_test;
+------+--------------------------------------+
| id   | u                                    |
+------+--------------------------------------+
|    1 | 7e379d63-56ac-11e4-8477-3c970ea31ea8 |
+------+--------------------------------------+

The data on the slave is clearly inconsistent with the master’s. The slave, replicating via STATEMENT binlog format, re-executes the INSERT command and gets a different UUID value.

External

One solution to the above is to generate the UUID value from your application. By the time MySQL gets the INSERT statement, the UUID value is a constant string, as far as MySQL is concerned.

Internal

However there’s a way to do it from within MySQL, by decoupling the UUID() function from the INSERT statement. It takes a session variable. Consider: Continue reading » “Making UUID() and RAND() replication safe”

orchestrator 1.1.18: new features, support for orchestrator-agent

Outbrain‘s orchestrator 1.1.18 is released:

  • Support for orchestrator-agent (see announcement): agent pages, support for agent actions, initiation of seeds (provisioning of new/corrupted servers), auditing of seeds.
  • Clusters dashboard
  • Support for long query auditing
  • SSL
  • Proxy authentication (e.g. apache2 serving as reverse-proxy with LDAP)
  • User control
  • Better slave moving rules.

Quick links:

orchestrator now allows for seeding/provisioning of servers via orchestrator-agent. It communicates with agents executing on the MySQL hosts and coordinate transfer of data between them. orchestrator now supports invocation and auditing of seeding operations, and protects you from breaking your seeds. The orchestrator-agent is a solution to Outbrain’s specific use case, and may not appeal to the greater crowd. Nonetheless it is extendible and is released as open source. Continue reading » “orchestrator 1.1.18: new features, support for orchestrator-agent”

Announcing orchestrator-agent

orchestrator-agent is a side-kick, complementary project of orchestrator, implementing a daemon service on one’s MySQL hosts which communicates with and accepts commands from orchestrator, built with the original purpose of providing an automated solution for provisioning new or corrupted slaves.

It was built by Outbrain, with Outbrain’s specific use case in mind. While we release it as open source, only a small part of its functionality will appeal to the public (this is why it’s not strictly part of the orchestrator project, which is a general purpose, wide-audience solution). Nevertheless, it is a simple implementation of a daemon, such that can be easily extended by the community. The project is open for pull-requests!

A quick breakdown of orchestrator-agent is as follows:

  • Executes as a daemon on linux hosts
  • Interacts and invokes OS commands (via bash)
  • Does not directly interact with a MySQL server running on that host (does not connect via mysql credentials)
  • Expects a single MySQL service on host
  • Can control the MySQL service (e.g. stop, start)
  • Is familiar with LVM layer on host
  • Can take LVM snapshots, mount snapshots, remove snapshots
  • Is familiar with the MySQL data directory, disk usage, file system
  • Can send snapshot data from a mounted snapshot on a running MySQL host
  • Can prepare data directory and receive snapshot data from another host
  • Recognizes local/remote datacenters
  • Controlled by orchestrator, two orchestrator-agents implement an automated and audited solution for seeding a new/corrupted MySQL host based on a running server.

Continue reading » “Announcing orchestrator-agent”

Get per-object grants via common_schema

Did you know common_schema supports a complete breakdown of all accounts on your database server? It can provide you with the GRANT statements required to set up an account, the REVOKE statements to undo the former, and this can be broken down on a per-object & per-object-type basis. Consider the sql_grants view:

Find who has privileges on a table called t:

select * from common_schema.sql_grants where priv_level_name='table' and object_name='t'\G

           GRANTEE: 'power1'@'localhost'
              user: power1
              host: localhost
        priv_level: `mytst`.`t`
   priv_level_name: table
     object_schema: mytst
       object_name: t
current_privileges: SELECT
      IS_GRANTABLE: NO
         sql_grant: GRANT SELECT ON `mytst`.`t` TO 'power1'@'localhost'
        sql_revoke: REVOKE SELECT ON `mytst`.`t` FROM 'power1'@'localhost'
     sql_drop_user: DROP USER 'power1'@'localhost'

or, who has privileges on the test schema:

select * from common_schema.sql_grants where priv_level_name='schema' and object_name='test' \G

           GRANTEE: 'power1'@'localhost'
              user: power1
              host: localhost
        priv_level: `test`.*
   priv_level_name: schema
     object_schema: NULL
       object_name: test
current_privileges: ALTER, ALTER ROUTINE, CREATE, CREATE ROUTINE, CREATE TEMPORARY TABLES, CREATE VIEW, DELETE, DROP, EVENT, EXECUTE, INDEX, INSERT, LOCK TABLES, REFERENCES, SELECT, SHOW VIEW, TRIGGER, UPDATE
      IS_GRANTABLE: NO
         sql_grant: GRANT ALTER, ALTER ROUTINE, CREATE, CREATE ROUTINE, CREATE TEMPORARY TABLES, CREATE VIEW, DELETE, DROP, EVENT, EXECUTE, INDEX, INSERT, LOCK TABLES, REFERENCES, SELECT, SHOW VIEW, TRIGGER, UPDATE ON `test`.* TO 'power1'@'localhost'
        sql_revoke: REVOKE ALTER, ALTER ROUTINE, CREATE, CREATE ROUTINE, CREATE TEMPORARY TABLES, CREATE VIEW, DELETE, DROP, EVENT, EXECUTE, INDEX, INSERT, LOCK TABLES, REFERENCES, SELECT, SHOW VIEW, TRIGGER, UPDATE ON `test`.* FROM 'power1'@'localhost'
     sql_drop_user: DROP USER 'power1'@'localhost'

           GRANTEE: 'test'@'localhost'
              user: test
              host: localhost
        priv_level: `test`.*
   priv_level_name: schema
     object_schema: NULL
       object_name: test
current_privileges: ALTER, ALTER ROUTINE, CREATE, CREATE ROUTINE, CREATE TEMPORARY TABLES, CREATE VIEW, DELETE, DROP, EVENT, EXECUTE, INDEX, INSERT, LOCK TABLES, REFERENCES, SELECT, SHOW VIEW, TRIGGER, UPDATE
      IS_GRANTABLE: NO
         sql_grant: GRANT ALTER, ALTER ROUTINE, CREATE, CREATE ROUTINE, CREATE TEMPORARY TABLES, CREATE VIEW, DELETE, DROP, EVENT, EXECUTE, INDEX, INSERT, LOCK TABLES, REFERENCES, SELECT, SHOW VIEW, TRIGGER, UPDATE ON `test`.* TO 'test'@'localhost'
        sql_revoke: REVOKE ALTER, ALTER ROUTINE, CREATE, CREATE ROUTINE, CREATE TEMPORARY TABLES, CREATE VIEW, DELETE, DROP, EVENT, EXECUTE, INDEX, INSERT, LOCK TABLES, REFERENCES, SELECT, SHOW VIEW, TRIGGER, UPDATE ON `test`.* FROM 'test'@'localhost'
     sql_drop_user: DROP USER 'test'@'localhost'

In the same manner, you can easily SELECT for all grants that are defined per-table, per-schema, … priv_level_name: is any one of ‘user’, ‘schema’, ‘table’, ‘column’, ‘routine’.

The above is a view, which aggregates data from all relevant INFORMATION_SCHEMA tables, normalizing and de-normalizing as necessary.

You might also want to look at sql_show_grants which provides with a no-breakdown GRANT for all accounts. It’s like a SHOW ALL GRANTS which doesn’t exist, plus it’s SELECTable.

Also, thanks to eval(), the above make for executable code. Consider: Continue reading » “Get per-object grants via common_schema”

Three wishes for a new year

Another new year by Jewish calendar. What do I wish for the following year?

  1. World peace
  2. Good health to all
  3. Multi-core execution for queries

After having evaluated a few columnar databases, and having seen how a single query gets 24 cores busy, I can’t look at MySQL the same way again. The fact that a single query consumes a single core only doesn’t seem right in the year 2014. Shard-query is a cool application-level attempt to solve the above; I would like to see stuff like this implemented inside the server (or inside the storage engine where possible).

None of my wishes in previous years [2010], [2011], [2012], [2013] came true (and mostly gone worse). I’m still willing to settle for two out of three.

zookeepercli: lightweight, powerful, controlled command line client for ZooKeeper

I’m happy to announce the availability of zookeepercli: a lightweight, simple, fast and controlled command line client for ZooKeeper.

zookeepercli allows for:

  • Basic CRUD-like operations: createsetdeleteexistsgetls (aka children).
  • Extended operations: lsr (ls recursive),  creater (create recursively)
  • Well formatted and controlled output: supporting either txt or json format
  • Single, no-dependencies binary file, based on a native Go ZooKeeper library by github.com/samuel/go-zookeeper (LICENSE)

I was dissatisfied with existing command line access to ZooKeeper. Uncontrolled and noisy output as well as large footprint were among the reasons. zookeepercli overcomes the above and provides with often required powers.

Usage samples:


$ zookeepercli --servers srv-1,srv-2,srv-3 -c create /demo_only "path placeholder"
$ zookeepercli --servers srv-1,srv-2,srv-3 -c create /demo_only/key1 "value1"
$ zookeepercli --servers srv-1,srv-2,srv-3 -c create /demo_only/key2 "value2"
$ zookeepercli --servers srv-1,srv-2,srv-3 -c create /demo_only/key3 "value3"

$ zookeepercli --servers srv-1,srv-2,srv-3 -c ls /demo_only
key3
key2
key1

# Same as above, JSON format output:
$ zookeepercli --servers srv-1,srv-2,srv-3 --format=json -c ls /demo_only
["key3","key2","key1"]

$ zookeepercli --servers srv-1,srv-2,srv-3 -c delete /demo_only/key1
$ zookeepercli --servers srv-1,srv-2,srv-3 -c delete /demo_only/key2
$ zookeepercli --servers srv-1,srv-2,srv-3 -c delete /demo_only/key3
$ zookeepercli --servers srv-1,srv-2,srv-3 -c delete /demo_only

# Create a path recursively (auto-generate parent directories if not exist):
$ zookeepercli --servers=srv-1,srv-2,srv-3 -c creater "/demo_only/child/key1" "val1"
$ zookeepercli --servers=srv-1,srv-2,srv-3 -c creater "/demo_only/child/key2" "val2"

$ zookeepercli --servers=srv-1,srv-2,srv-3 -c get "/demo_only/child/key1"
val1

# This path was auto generated due to recursive create:
$ zookeepercli --servers=srv-1,srv-2,srv-3 -c get "/demo_only" 
zookeepercli auto-generated

# ls recursively a path and all sub children:
$ zookeepercli --servers=srv-1,srv-2,srv-3 -c lsr "/demo_only" 
child
child/key1
child/key2 

zookeepercli is released as open source by Outbrain under the Apache 2.0 license.

Quick links:

Orchestrator 1.0.5: refactoring masters, multi-master replication

Outbrain’s orchestrator Version 1.0.5 is released.

Quick links: Orchestrator Manual, FAQ, Downloads

Orchestrator now supports refactoring of masters via master-master topologies. It now allows promoting slaves as co-masters and detachment of instances from a co-master topology, effectively allowing for replacing an active master.

Like this

Drag a master:

orchestator-cm-simple-drag-master-01

Onto one of its slaves: Continue reading » “Orchestrator 1.0.5: refactoring masters, multi-master replication”

Some MySQL security tips

This is a brief list of security tips for MySQL. It is by no means complete.

  • Follow the sudo example. Don’t let all you DBAs and Ops have the password for the root account. Have each and every one of them have their own personal super-duper account, with their own personal and private password. This makes it so easy when someone leaves the company. No need to change passwords, just to remove the employee’s account.
  • Block root. Either remove it completely or forbid it from logging in. Yes, there’s a way hack in MySQL to have a valid account blocked from logging in. One way of making this happen is via common_schema‘s sql_accounts. Here’s how to block root account using common_schema:
mysql> CALL common_schema.eval("SELECT sql_block_account FROM sql_accounts WHERE USER = 'root'");
  • Make lots of small users. Give nagios its own user. Give collectd its own user. Give orchestrator its own user. Give innotop its own user. Give whatever its own user. Yes, it’s more users to create, but you get to have each user as limited in privileges as possible, and you don’t get to wonder why your heartbeat script has SUPER, LOCK and SHUTDOWN privileges.
  • Verify: set @@old_passwords=0; before setting a new password. Make sure your configuration does not specify old_passwords = 1. There is no reason to use “old passwords“. In fact, a 5.6 client will refuse connecting with an “old password”.
  • Give no access to mysql.*. No one should be tampering directly with the mysql system tables.
  • Run oak-security-audit or, if you have common_schema installed (you mean you don’t?), just CALL security_audit(); I can (almost) guarantee you’d be surprised and thankful for the security breakdown. Users without passwords, users sharing same passwords, users with unreasonable privileges, and more… You’ll see them all.
  • If you have web interfaces to your database or some of its aspects (e.g. Anemometer, Propagator, Orchestrator, monitoring, …), protect it via LDAP group or similar. Not everyone who has access to your network needs to see you database. Neither does every employee.