Oracle ACE

I am honored to have been nominated for, and to have received the Oracle ACE award.

Nomination for this award is made by Oracle community members, and in this case those being Oracle employees Keith Larson and Dave Stokes. The award is given by Oracle for my involvement in the Oracle/MySQL community and for my contributions.

While open source involvement is generally done in the mere purpose of sharing knowledge and solutions, recognition plays a role in it. For the most part, one who writes blogs wants them to be read, and one who writes code wants it to be downloaded and tested, which is an elemental type of recognition, and what I aim for.

The recognition given by the Oracle ACE award makes for a wonderful complement, being given by the corporate with whose products I’m involved. It is great to get a surprising “Hey, good work” acknowledgement. Sun/MySQL told me that back in 2009, and I was caught utterly unprepared. I am still unprepared!

Thank you, the community, the people from whom I learn and benefit, anyone who ever reads my blogs, who comment, anyone who tries my code, who provides feedback, the multitude of people writing, sharing, blogging, speaking, coding, fixing, spreading, who make it such a great community.

Thank you, Oracle for this award!

Self throttling MySQL queries

Recap on the problem:

  • A query takes a long time to complete.
  • During this time it makes for a lot of I/O.
  • Query’s I/O overloads the db, making for other queries run slow.

I introduce the notion of self-throttling queries: queries that go to sleep, by themselves, throughout the runtime. The sleep period means the query does not perform I/O at that time, which then means other queries can have their chance to execute.

I present two approaches:

  • The naive approach: for every 1,000 rows, the query sleep for 1 second
  • The factor approach: for every 1,000 rows, the query sleeps for the amount of time it took to iterate those 1,000 rows (effectively doubling the total runtime of the query). Continue reading » “Self throttling MySQL queries”

Impressions from Percona Live, London

Am now back from Percona Live, London. Here are some quick impressions of mine.

The physical location of the conference was excellent, not far from Tower Bridge. The conference area itself was nice and has good atmosphere. Very well organized. Kudos to Percona!

Theme

The official theme for the conference was “Discover the Power of MySQL”. However, looking back at the conference, I think the de-facto theme was “High Availability and Scale Out for MySQL”.

Why do I say that? Look at the sponsors of the event; look at the talks. There were talks by and about Clustrix, Tungsten (parallel replication), Schooner, Choosing a HA solution (by Henrik), Fusion-IO, MySQL Cluster, MySQL in the Cloud, Galera Replication, Scalebase, Spider, MHA, Various HA Solutions (by Oli Sennhauser)…

There were other talks, about all sorts of things. But to me it seemed like scale-out was the big thing, and HA the next one. Especially looking at sponsors booths it looked that way.

People

Of course, it’s no secret that meeting the people is a great fun. I met with familiar faces, and was happy to meet new ones. It’s great to just discuss stuff with people, share opinions.

The after party was fun, and well organized. Kudos to Clustrix who sponsored it!

Business talk

I wasn’t at the conference to sell anything myself, and so was on a relaxed mood. I was happy that whatever sales talk I shared, heard or overheard was highly technical and very open. It was also good to openly discuss possible relationships.

I was in particular pleased to see the popularity some of my tools and ideas were gaining. I could recognize at least two talks where my tools were mentioned; I was delighted to talk to DBAs telling me they were using openark-kit in general, oak-online-alter-table in particular, or some SQL solutions and tricks I offered in my blog. This is really wonderful! To write something having people use it to their benefit — I suppose this is what open source aspires to. It’s great to hear feedback, and better yet to get good feedback.

I also had the pleasure of meeting people after hours, and have good discussions, not necessarily MySQL related.

To top it all, I had free time on Wednesday, where I had a superior visit to the Natural History Museum, which left me in awe.

Contest for Glory: write a self throttling MySQL query

What’s all this about?

I’ve you’ve been to my talk in London, then you’ve already got this as homework. If not, allow me to fill in the details:

I was speaking about MySQL’s programmatic nature in many aspects (best if you read the slides!). We discussed user defined variables, derived tables, ordering constructs, order of evaluation, time suspension and time calculation.

An issue I presented was that of a very long running query. Say it runs for 20 minutes. It kills your I/O. Do you KILL the query or wait? It could terminate in 5 seconds from now, and if you kill it now, you lose everything. But it may yet run for 3 more hours!

We discussed a futuristic feature where the query would gracefully terminate after some designated time. However, futuristic wishes do not help us.

A self throttling query

I suggested the idea for a self throttling query. We know how to throttle writing queries, such as DELETE queries: we break them into small chunks, then work each chunk at a time, setting ourselves to sleep in between chunks. This is how –sleep and –sleep-ratio work in oak-chunk-update. It is how –sleep and –sleep-coef work in pt-archiver.

But can the same be done for SELECT queries? Continue reading » “Contest for Glory: write a self throttling MySQL query”

Slides from my talk: “Programmatic Queries: things you can code with SQL”

Here are the slides from my talk Programmatic Queries: things you can code with SQL held on October 25th on the Percona Live event, London.

Programmatic Queries: PDF

I wish to thank those who attended my talk.

The topic of the talk was irregular and, to some extent, controversial. Should one really rely on internal implementation for optimizing her queries? Sergei Golubchik was quick to suggest that even in current and coming versions of MariaDB, some things I spoke about may not hold true. I accept.

I was in urge to complete my talk within the time frame. I was happy to hear later on that my talk was well received.

There were homework to my talk. I thought I would make some contest picking the best answers, stay tuned.

Test-driven SQL development

I’m having a lot of fun writing common_schema, an SQL project which includes views, tables and stored routines.

As the project grows (and it’s taking some interesting directions, in my opinion) more dependencies are being introduced, and a change to one routine or view may affect many others. This is why I’ve turned the development on common_schema to be test driven.

Now, just how do you test drive an SQL project?

Well, much like the way you test any other project in your favorite programming language. If its functions you’re testing, that’s all too familiar: functions get some input and provide some output. Hmmm, they might be changing SQL data during that time. With procedures it’s slightly more complex, since they do not directly return output but result sets.

Here’s the testing scheme I use: Continue reading » “Test-driven SQL development”

Speaking on Percona Live, London: “Programmatic Queries: things you can code with SQL”

I’ll be speaking at the Percona Live event, held in London, October 24, 25, 2011.

My session is called Programmatic Queries: things you can code with SQL. It’s a short 30 minute talk, in which I present underlying knowledge of the programmatic nature of SQL queries within MySQL, and how to take advantage of such knowledge so as to build faster, shorter, and sometimes unexpected queries.

This is not about stored routine programming, a classic programmatic aspect of MySQL, but rather about expected order of execution: of row evaluation, of control flow statements, of table inference, of time issues.

I have far too many examples, some real-world problem solvers, and some less common in daily use, to be able to deliver them all on this session. I will pick up those which seem most interesting to me, or those best presenting the programmatic nature of the query. As time allows I may add more examples, or look into interesting future possibilities.

I hope to see you there.

Three wishes for a new year

It’s another new year by Jewish calendar. And what do I wish for in the following year?

  1. World peace
  2. Good health to all
  3. Have some way to turn SHOW commands into SELECT statements, server side. I’m fervently trying to hack around this. Stored routines, export/import from file, text manipulation, I don’t care! I want to SELECT seconds_behind_master somehow. Without plugins.

PS, none of my last year’s wishes came true. I’ll settle for two out of three.

common_schema rev. 68: eval(), processlist_grantees, candidate_keys, easter_day()

Revision 68 of common_schema is out, and includes some interesting features:

  • eval(): Evaluates the queries generated by a given query
  • match_grantee(): Match an existing account based on user+host
  • processlist_grantees: Assigning of GRANTEEs for connected processes
  • candidate_keys: Listing of prioritized candidate keys: keys which are UNIQUE, by order of best-use.
  • easter_day(): Returns DATE of easter day in given DATETIME’s year.

Let’s take a slightly closer look at these:

eval()

I’ve dedicated this blog post on MySQL eval() to describe it. In simple summary: eval() takes a query which generates queries (most common use queries on INFORMATION_SCHEMA) and auto-evaluates (executes) those queries. Read more

match_grantee()

As presented in Finding CURRENT_USER for any user, I’ve developed the algorithm to match a connected user+host details (as presented with PROCESSLIST) with the grantee tables (i.e. the mysql.user table), in a manner which simulates the MySQL server account matching algorithm.

This is now available as a stored function: given a user+host, the function returns with the best matched grantee. Read more

processlist_grantees

This view relies on the above, and maps the entire PROCESSLIST onto GRANTEEs. The view maps each process onto the GRANTEE (MySQL account) which is the owner of that process. Surprisingly, MySQL does not provide one with such information. Continue reading » “common_schema rev. 68: eval(), processlist_grantees, candidate_keys, easter_day()”

MySQL eval()

I’ve just implemented an eval() call for MySQL. It is implemented with SQL, using a stored procedure. So this is not some plugin: you can use it from within your normal database server.

Just what is an eval() call?

In some programming languages it would mean: get some text, and execute it as though it were complied code. So, dynamic coding.

In SQL: get the text of query which generates SQL statements in itself (either DML or DDL), and invoke those implied SQL statements.

A simple example

Best if I present Mass killing of MySQL Connections by Peter Zaitsev. The thing is to execute a query, typically on INFORMATION_SCHEMA, which uses metadata so as to generate SQL queries/commands. Peter’s example is:

select concat('KILL ',id,';') from information_schema.processlist where user='root'

The above query generates KILL commands for all users called ‘root’. I do many such queries in common_schema: like creating the GRANT statements for accounts, the DROP KEY statements for redundant keys, the ADD and DROP statements for foreign keys etc.

So the problem is you have to export those statements to file, then execute them from file: either using SOURCE, as in Peter’s example, or from shell prompt, piping file contents into mysql client.

You can now eval()

I’ve been on family holiday for a couple of weeks, which meant no need to think of work. Which means more time to think of SQL (darn!). And I’ve found the way to do it completely within the server (no external files required). Continue reading » “MySQL eval()”