gh-ost – code.openark.org http://shlomi-noach.github.io/blog/ Blog by Shlomi Noach Mon, 11 May 2020 06:28:06 +0000 en-US hourly 1 https://wordpress.org/?v=5.3.3 32412571 Using dbdeployer in CI tests https://shlomi-noach.github.io/blog/mysql/using-dbdeployer-in-ci-tests https://shlomi-noach.github.io/blog/mysql/using-dbdeployer-in-ci-tests#respond Tue, 20 Feb 2018 07:29:58 +0000 https://shlomi-noach.github.io/blog/?p=7848 I was very pleased when Giuseppe Maxia (aka datacharmer) unveiled dbdeployer in his talk at pre-FOSDEM MySQL day. The announcement came just at the right time. I wish to briefly describe how we use dbdeployer (work in progress).

The case for gh-ost

A user opened an issue on gh-ost, and the user was using MySQL 5.5. gh-ost is being tested on 5.7 where the problem does not reproduce. A discussion with Gillian Gunson raised the concern of not testing on all versions. Can we run gh-ost tests for all MySQL/Percona/MariaDB versions? Should we? How easy would it be?

gh-ost tests

gh-ost has three different test types:

  • Unit tests: these are plain golang logic tests which are very easy and quick to run.
  • Integration tests: the topic of this post, see following. Today these do not run as part of an automated CI testing.
  • System tests: putting our production tables to the test, continuously migrating our production data on dedicated replicas, verifying checksums are identical and data is intact, read more.

Unit tests are already running as part of automated CI (every PR is subjected to those tests). Systems tests are clearly tied to our production servers. What’s the deal with the integration tests?

gh-ost integration tests

The gh-ost integration tests are a suite of scenarios which verify gh-ost‘s operation is sound. These scenarios are mostly concerned with data types, special alter statements etc. Is converting DATETIME to TIMESTAMP working properly? Are latin1 columns being updated correctly? How about renaming a column? Changing a PRIMARY KEY? Column reorder? 5.7 JSON values? And so on. Each test will recreate the table, run migration, stop replication, check the result, resume replication…

The environment for these tests is a master-replica setup, where gh-ost modifies on the table on the replica and can then checksum or compare both the original and the altered ghost table.

We develop gh-ost internally at GitHub, but it’s also an open source project. We have our own internal CI environment, but then we also wish the public to have visibility into test failures (so that a user can submit a PR and get a reliable automated feedback). We use Travis CI for the public facing tests.

To run gh-ost‘s integration tests as described above as part of our CI tests we should be able to:

  • Create a master/replica setup in CI.
  • Actually, create a master/replica setup in any CI, and namely in Travis CI.
  • Actually, create multiple master/replica setups, of varying versions and vendors, in any ci, including both our internal CI and Travis CI.

I was about to embark on a MySQL Sandbox setup, which I was not keen on. But FOSDEM was around the corner and I had other things to complete beforehand. Lucky me, dbdeplyer stepped in.

dbdeployer

dbdeployer is a rewrite, a replacement to MySQL Sandbox. I’ve been using MySQL Sandbox for many years, and my laptop is running two sandboxes at this very moment. But MySQL Sandbox has a few limitations or complications:

  • Perl. Versions of Perl. Dependencies of packages of Perl. I mean, it’s fine, we can automate that.
  • Command line flag complexity: I always get lost in the complexity of the flags.
  • Get it right or prepare for battle: if you deployed something, but not the way you wanted, there’s sometimes limbo situations where you cannot re-deploy the same sandbox again, or you should start deleting files everywhere.
  • Deploy, not remove. Adding a sandbox is one thing. How about removing it?

dbdeployer is a golang rewrite, which solves the dependency problem. It ships as a single binary and nothing more is needed. It is simple to use. While it generates the equivalence of a that of a MySQL Sandbox, it does so with less command line flags and less confusion. There’s first class handling of the MySQL binaries: you unpack MySQL tarballs, you can list what’s available. You can then create sandbox environments: replication, standalone, etc. You can then delete those.

It’s pretty simple and I have not much more to add — which is the best thing about it.

So, with dbdeployer it is easy to create a master/replica. Something like:

dbdeployer unpack path/to/5.7.21.tar.gz --unpack-version=5.7.21 --sandbox-binary ${PWD}/sandbox/binary
dbdeployer replication 5.7.21 --nodes 2 --sandbox-binary ${PWD}/sandbox/binary --sandbox-home ${PWD}/sandboxes --gtid --my-cnf-options log_slave_updates --my-cnf-options log_bin --my-cnf-options binlog_format=ROW

Where does it all fit in, and what about the MySQL binaries though?

So, should dbdeployer be part of the gh-ost repo? And where does one get those MySQL binaries from? Are they to be part of the gh-ost repo? Aren’t they a few GB to extract?

Neither dbdeployer nor MySQL binaries should be added to the gh-ost repo. And fortunately, Giuseppe also solved the MySQL binaries problem.

The scheme I’m looking at right now is as follows:

  • A new public repo, gh-ost-ci-env is created. This repo includes:
    • dbdeployer compiled binaries
    • Minimal MySQL tarballs for selected versions. Those tarballs are reasonably small: between `14MB` and `44MB` at this time.
  • gh-ost‘s CI to git clone https://github.com/github/gh-ost-ci-env.git (code)
  • gh-ost‘s CI to setup a master/replica sandbox (one, two).
  • Kick the tests.

The above is a work in progress:

  • At this time only runs a single MySQL version.
  • There is a known issue where after a test, replication may take time to resume. Currently on slower boxes (such as the Travis CI containers) this leads to failures.

Another concern I have at this time is build time. For a single MySQL version, it takes some 5-7 minutes on my local laptop to run all integration tests. It will be faster on our internal CI. It will be considerably slower on Travis CI, I can expect between 10m - 15m. Add multiple versions and we’re looking at a 1hr build. Such long build times will affect our development and delivery times, and so we will split them off the main build. I need to consider what the best approach is.

That’s all for now. I’m pretty excited for the potential of dbdeployer and will be looking into incorporating the same for orchestrator CI tests.

 

 

]]>
https://shlomi-noach.github.io/blog/mysql/using-dbdeployer-in-ci-tests/feed 0 7848
gh-ost 1.0.42 released: JSON support, optimizations https://shlomi-noach.github.io/blog/mysql/gh-ost-1-0-42-released-json-support-optimizations https://shlomi-noach.github.io/blog/mysql/gh-ost-1-0-42-released-json-support-optimizations#comments Thu, 14 Sep 2017 07:26:16 +0000 https://shlomi-noach.github.io/blog/?p=7799 gh-ost 1.0.42 is released and available for download.

JSON

MySQL 5.7’s JSON data type is now supported.

There is a soft-limitation, that your JSON may not be part of your PRIMARY KEY. Currently this isn’t even supported by MySQL anyhow.

Performance

Two noteworthy changes are:

  • Client side prepared statements reduce network traffic and round trips to the server.
  • Range query iteration avoids creating temporary tables and filesorting.

We’re not running benchmarks at this time to observe performance gains.

5.7

More tests validating 5.7 compatibility (at this time GitHub runs MySQL 5.7 in production).

Ongoing

Many other changes included.

We are grateful for all community feedback in form of open Issues, Pull Requests and questions!

gh-ost is authored by GitHub. It is free and open source and is available under the MIT license.

Speaking

In two weeks time, Jonah Berquist will present gh-ost: Triggerless, Painless, Trusted Online Schema Migrations at Percona Live, Dublin.

Tom Krouper and myself will present MySQL Infrastructure Testing Automation at GitHub, where, among other things, we describe how we test gh-ost in production.

]]>
https://shlomi-noach.github.io/blog/mysql/gh-ost-1-0-42-released-json-support-optimizations/feed 4 7799
Speaking at August Penguin, MySQL Track, GitHub sponsored https://shlomi-noach.github.io/blog/mysql/speaking-at-august-penguin-mysql-track-github-sponsored https://shlomi-noach.github.io/blog/mysql/speaking-at-august-penguin-mysql-track-github-sponsored#respond Sun, 03 Sep 2017 10:42:05 +0000 https://shlomi-noach.github.io/blog/?p=7772 This Thursday I’ll be presenting at August Penguin, conveniently taking place September 7th, 8th, Ramat Gan, Israel.

I will be speaking as part of the MySQL track, 2nd half of Thursday. The (Hebrew) schedule is here.

My talk is titled Reliable failovers, safe schema migrations: open source solutions to MySQL problems. I will describe some of the open source MySQL infrastructure work we run at GitHub ; how it solves reliability, availability and usability. I’ll describe some of our internal workflows and our use of chat and chatops.

I’m proud to announce GitHub sponsors the event. We won’t have a booth, but please do grab me in the hallways or over lunch to chat!

And, yes, octocat stickers will be made available 🙂

 

]]>
https://shlomi-noach.github.io/blog/mysql/speaking-at-august-penguin-mysql-track-github-sponsored/feed 0 7772
Discussing online schema migrations with Oracle’s MySQL engineering managers https://shlomi-noach.github.io/blog/mysql/discussing-online-schema-migrations-with-oracles-mysql-engineering-managers https://shlomi-noach.github.io/blog/mysql/discussing-online-schema-migrations-with-oracles-mysql-engineering-managers#comments Wed, 23 Nov 2016 13:23:18 +0000 https://shlomi-noach.github.io/blog/?p=7646 Last week I had the pleasant opportunity of introducing and discussing the operation of online schema migrations to MySQL’s engineering managers, as part of their annual meeting, in London.

Together with Simon J. Mudd of Booking.com, we discussed our perception of what it takes to run online schema migrations on a live, busy system.

While the Oracle/MySQL engineers develop new features or optimize behavior in the MySQL, we of the industry have the operational expertise and understanding of the flow of working with MySQL. In all topics, and in schema migration in particular, there is a gap between what’s perceived to be the use case and what the use case actually is. It is the community’s task to provide feedback back to Oracle so as to align development to match operations need where possible.

Our meeting included the following:

Need for schema migrations

We presented, based on our experience in current and past companies, and based on our friends of the community’s experience, the case for online schema migrations. At GitHub, at Booking.com and in many other companies I’m familiar with, we continuously deploy to production, and this implies continuous schema migrations to our production databases. We have migrations running daily; sometimes multiple per day, some time none.

With continuous deployment, we as Guardians of the Database do not wish to be blockers for the development cycle. On the contrary, we want to be out of the way as soon as possible, other than verifying a requested migration is safe. We wish to be able to deliver a migration at any given time.

Not all companies behave this way; some run a weekly aggregation of migrations. Others yet still use the Though Shall Not Pass DBA model. We tried to depict the various approaches with strong emphasis on our own approach, which is the most demanding of schema migration solutions.

The MySQL ALTER

We proceeded to discuss the in-house ALTER statement & InnoDB online DDL, and pointed out the limitations those impose on “online” operations to the effect of rendering these solutions unused by many. The serialization in replication stream means losing serving capacity, getting lagging replicas. The lack of escape path means a commitment into an hours worth of uninterruptible operation. The lack of resource control implies getting performance degraded throughout the operation.

We briefly touched on the TokuDB’s ALTER and how it worked.

Replication solutions

We discussed migrating via replication: running migrations on one or more replicas at a time, finally failing over onto a promoted replica once all replicas are updated.

We know this solution to be in use in companies such as DropBox, Etsy and others. We illustrated our own reasoning for not using this solution:

  • Increased clock-time for running a migration: running a one-replica-at-a-time or few-replicas-at-a-time can double, triple, quadruple and so forth the overall migration time.
  • Concurrent migration complexity: and since runtime increases, so does the likelihood of needing to run additional migration at the same time, which highly complicates the flow in a one-at-a-time or few-at-a-time model.
  • Serving capacity: in this model some, or up to half the number of servers, are non operational. Serving capacity is reduced and we need to have more hardware to support that
  • Failover: the failover is not smooth; it either includes some outage or some block time, and at any case noticeable in production. Having a planned failover once in a while is OK, but having a failover multiple times a day is too much of a hustle, in our current setup.
  • Topology complexity: how our topologies always have some special cases, such as cross-DC replication with reduced cross-DC network traffic via intermediate masters, testing replicas with newer versions, developer-dedicated servers and others, that make shuffling of replicas around difficult to automate.

We have not discussed Galera’s Rolling Schema Upgrades as we personally do not have the experience of working with it. It solves the failover issue above, but given a “normal” replication tree under the cluster, same problems as above apply.

We concluded with our personal take, that like everything else, we just like to write stuff directly onto our masters, and let the natural replication flow deal with it and get our entire topology to be consistent.

Existing trigger based migrations

We drilled down into the algorithms behind pt-online-schema-change and Facebook’s OSC (the latter being rewritten today, not yet released as open source). We elaborated on the pains we saw in trigger based migrations: being unsuspendible, causing lock spaghetti, impacting write latency on the master to the point of a standstill on busy servers, being untestable.

gh-ost

I presented  gh-ost, our own, triggerless take on schema migrations. I discussed the logic behind gh-ost and how it decouples migration load from production load; the low impact the triggerless migration has on the master and on the entire replication chain, leading to low, subsecond replication lags throughout the migration and eliminating locking contention on the master. Basically the presentation Tom Krouper and I gave at Percona Live Amsterdam.

Want to Have

We followed up by a list of feature requests we could enjoy. These were largely technical issues gh-ost would benefit from, simplifying its behavior or ensuring its correctness in complex cases. We discussed dropping tables at end of migration, getting more info in the binary logs, GTID issues and more.

Acknowledgements

Thank you to Morgan Tocker for officially inviting us to this gathering. There were quite a few familiar faces in the room, and it was a friendly gathering. Thank you to all the engineering managers with whom we met!

The discussion was lively, friendly and receptive. The Oracle engineers laid out the internals of the online DDL; some of their thoughts on the potential of the JSON format; gave advice on technical issues presented. I’d like to thank them for listening to our take on the subject. There was a discussion on the possible paths Oracle can take to improve online schema operations, and I’d like to thank Oracle for sharing their own thoughts and advice!

]]>
https://shlomi-noach.github.io/blog/mysql/discussing-online-schema-migrations-with-oracles-mysql-engineering-managers/feed 1 7646
Three wishes for a new year https://shlomi-noach.github.io/blog/mysql/three-wishes-for-a-new-year-4 https://shlomi-noach.github.io/blog/mysql/three-wishes-for-a-new-year-4#comments Wed, 28 Sep 2016 14:20:54 +0000 https://shlomi-noach.github.io/blog/?p=7643 (Almost) another new year by Jewish calendar. What do I wish for the following year?

  1. World peace
  2. Good health to all
  3. Relaxed GTID constraints

I’m still not using GTID, and still see operational issues with working with GTID. As a latest example, our new schema migration solution, gh-ost, allows us to test migrations in production, on replicas. The GTID catch? gh-ost has to write something to the binary log. Thus, it “corrupts” the replica with a bogus GTID entry that will never be met in another server, thus making said replica unsafe to promote. We can work around this, but…

I understand the idea and need for the Executed GTID Set. It will certainly come in handy with multi-writer InnoDB Cluster. However for most use cases GTID poses a burden. The reason is that our topologies are imperfect, and we as humans are imperfect, and operations are most certainly imperfect. We may wish to operate on a replica: test something, by intention or mistake. We may wish to use a subchain as the seed for a new cluster split. We may wish to be able to write to downstream replicas. We may use a 3rd party tool that issues a flush tables with read lock without disabling sql_log_bin. Things just happen.

For that, I would like to suggest GTID control levels, such as:

  1. Strict: same as Oracle’s existing implementation. Executed sets, purged sets, whatnot.
  2. Last executed: a mode where the only thing that counts is the last executed GTID value. If I repoint replica, all it needs to check is “hey this is my last executed GTID entry, give me the coordinates of yours. And, no, I don’t care about comparing executed and purged sets, I will trust you and keep running from that point on”
  3. Declarative: GTIDs are generated, are visible in each and every binary log entry, but are completely ignored.

I realize Oracle MySQL GTID is out for some over 3 years now, but I’m sorry – I still have reservations and see use cases where I fear it will not serve me right.

How about my previous years wishes? World peace and good health never came through, however:

  • My 2015 wish for “decent, operations friendly built in online table refactoring” was unmet, however gh-ost is a thing now and exceeds my expectations. No, really. Please come see Tom & myself present gh-ost and how it changed our migration paradigm.
  • My 2012 wish for “decent, long waited for, implementation of Window Functions (aka Analytic Functions) for MySQL” was met by MariaDB’s window functions.
    Not strictly Window Functions, but Oracle MySQL 8.0 will support CTE (hierarchial/recursive), worth a mention.

See you in Amsterdam!

]]>
https://shlomi-noach.github.io/blog/mysql/three-wishes-for-a-new-year-4/feed 2 7643
gh-ost 1.0.17: Hooks, Sub-second lag control, Amazon RDS and more https://shlomi-noach.github.io/blog/mysql/gh-ost-1-0-17-hooks-sub-second-lag-control-amazon-rds-and-more https://shlomi-noach.github.io/blog/mysql/gh-ost-1-0-17-hooks-sub-second-lag-control-amazon-rds-and-more#respond Tue, 06 Sep 2016 09:44:14 +0000 https://shlomi-noach.github.io/blog/?p=7632 gh-ost version 1.0.17 is now released, with various additions and fixes. Here are some notes of interest:

Hooks

gh-ost now supports hooks. These are your own executables that gh-ost will invoke at particular points of interest (validation pass, about to cut-over, success, failure, status, etc.)

gh-ost will set various environment variables for your executables to pick up, passing along such information as migrated/ghost table name, elapsed time, processed rows, migrated host etc.

Sub-second lag control

At GitHub we’re very strict about replication lag. We keep it well under 1 second at most times. gh-ost can now identify sub-second lag on replicas (well, you need to supply with the right query). Our current production migrations are set by default with --max-lag-millis=500 or less, and our most intensive migrations keep replication lag well below 1sec or even below 500ms

No SUPER

The SUPER privilege is required to set global binlog_format='ROW' and for STOP SLAVE; START SLAVE;

If you know your replica has RBR, you can pass --assume-rbr and skips those steps.

RDS

Hooks + No Super = RDS, as seems to be the case. For --test-on-replica you will need to supply your own gh-ost-on-stop-replication hook, to stop your RDS replica at cut-over phase. See this tracking issue

master-master

While active-active are still not supported, you now have greater control over master-master topologies by being able to explicitly pick your master (as gh-ost arbitrarily picks one of the co-masters). Do so by passing --assume-master-host. See cheatsheet.

tungsten replicator

Similarly, gh-ost cannot crawl your tungsten topology, and you are able to specify --tungsten --assume-master-host=the.master.com. See cheatsheet.

Concurrent-rowcount

--exact-rowcount is awesomeness, keeping quite accurate estimate of progress. With --concurrent-rowcount we begin migration with a rough estimate, and execute select count(*) from your_table in parallel, updating our estimate later on throughout the migration

Stricter, safer

gh-ost works in STRICT_ALL_TABLES mode, meaning it would fail rather than set the wrong value to a column.

In addition to unit-testing and production continuous test, a set of local tests is growing, hopefully to run as CI tests later on.

Fixed problems

Fixed time_zone related bug, high unsigned values bug; added strict check for triggers, relaxed config file parsing, and more. Thank you to community contributors for PRs, from ipv6 to typos!

Known issues

Issues coming and going at all times — thank you for reporting Issues!

We have a confirmed bug with non-UTF charsets at this time. Some other minor issues and feature requests are open — we’ll take them as we go along.

Feedback requests

We are not testing gh-ost on RDS ourselves. We appreciate community feedback on this tracking issue.

We are not testing gh-ost on Galera/XtraDB cluster ourselves. We appreciate community feedback on this tracking issue.

We value submitted Issues and questions.

Speaking

We will be presenting gh-ost in the next month:

Hope to see you there, and thank you again to all contributors!

]]>
https://shlomi-noach.github.io/blog/mysql/gh-ost-1-0-17-hooks-sub-second-lag-control-amazon-rds-and-more/feed 0 7632