On stored routines and dynamic statements

I very much enjoyed reading Overloading Procedures by Michael McLaughlin: good stuff!

I’m dealing with similar issues in common_schema/QueryScript, where I implement a whole new scripting language within MySQL, interpreted by stored routines. I am now finalizing the next version of common_schema/QueryScript, with a major addition to the scripting language that puts even more power in the hands of the programmer/DBA using simple, clean syntax.

Still hush hush, the development of that feature touched on the very same issues described in Michael’s post. Present in the current release, these issues are intensified by the use and complexity of the new development. Here are a few insights of mine:

Internal array implementation

Like Michael, I started by implementing arrays through tables. That is, create a (temporary, in my case) table, wrap it up with a lot of stored routine code, and simulate an array. This array is not yet provided to the user, but is used internally for QueryScript’s own code.
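To make the idea concrete, here is a minimal sketch of a table-backed array. All names (demo_array, array_key, array_value) are hypothetical and chosen for illustration only; this is not QueryScript’s internal code, which wraps such a table in stored routines.

```sql
-- Hypothetical names throughout; a sketch of the technique, not QueryScript's code.
-- The temporary table plays the role of the array: one row per element.
CREATE TEMPORARY TABLE demo_array (
  array_key   VARCHAR(64) NOT NULL PRIMARY KEY,
  array_value TEXT
);

-- "Set" an element: insert it, or overwrite the value if the key already exists.
INSERT INTO demo_array (array_key, array_value)
VALUES ('0', 'first')
ON DUPLICATE KEY UPDATE array_value = VALUES(array_value);

-- "Get" an element by key.
SELECT array_value FROM demo_array WHERE array_key = '0';

-- Destroy the array once it is no longer needed.
DROP TEMPORARY TABLE demo_array;
```

Every array created this way costs a table creation and a table drop, which is where the overhead discussed next comes from.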

Well, disappointment here: during load tests on intense structures, such as a foreach loop, where each iteration of the loop requires the creation of an array, I found that the current solution does not hold well on busy servers.

Seemingly, there’s nothing wrong with the creation of a new table every once in a while, and in particular a temporary table. However, I quickly found out that a busy server thrashes the table cache with such an intense rate of creation/dropping of tables. The competition over the table cache mutex becomes intolerable and hogs not only the script’s execution but the entire server’s.

There’s also the issue of the type of array values — no going around using textual columns, of course, but — how long? A VARCHAR(32767) should be enough for any reasonable implementation, but — how much memory would that consume? Both MEMORY and standard temporary tables (Percona Server has that partially resolved) use a fixed row format, which means a 32K text is actually allocated in memory even when your value is ‘x’. Continue reading » “On stored routines and dynamic statements”

Auto caching tables

Is there a way to create a caching table, some sort of a materialized view, such that upon selecting from that table, its data is validated/invalidated?

Hint: yes.

But to elaborate the point: say I have some table data_table. Can I rewrite all my queries which access data_table to read from some autocache_data_table, but have nothing changed in the query itself? No caveats, no additional WHEREs, and still have that autocache_data_table provide the correct data, dynamically updated by some rule of our choice?

And: no crontab, no event scheduler, and no funny triggers on data_table? In such a way that invalidation/revalidation occurs upon SELECT?

Well, yes.

This post is long, but I suggest you read it through to understand the mechanism; it will be worthwhile.

Background

The following derives from my long research on how to provide better, faster and safer access to INFORMATION_SCHEMA tables. It is however not limited to this exact scenario, and in this post I provide a simple, general purpose example. I’ll have more to share about INFORMATION_SCHEMA specific solutions shortly.

I was looking for a server side solution which would not require query changes, apart from directing the query to other tables. The solution has to be supported by all standard MySQL installs; so: no plugins, no special rebuilds. Continue reading » “Auto caching tables”

Documentation in SQL: CALL for help()

Documentation is an important part of any project. On the projects I maintain I put a lot of effort into documentation, and, frankly, the majority of time spent on my projects is on documentation.

The matter of keeping the documentation faithful is a topic of interest. I’d like to outline a few documentation bundling possibilities, and then present the coming new documentation method for common_schema. I’ll talk about any bundling that is NOT man pages.

High level: web docs

This is the initial method of documentation I used for openark kit and mycheckpoint. It’s still valid for mycheckpoint. Documentation is web-based. You need Internet access to read it. It’s in HTML format.

Well, not exactly HTML format: I wrote it in WordPress. Yes, it’s HTML, but there’s a lot of noise around (theme, menus, etc.) which is not strictly part of the documentation.

While this is perhaps the easiest way to go, here are a few drawbacks: Continue reading » “Documentation in SQL: CALL for help()”

common_schema, rev. 178: foreach(), repeat_exec(), Roland Bouman, query analysis

common_schema, revision 178 is now released, with major additions. This revision turns common_schema into a framework, rather than a set of views and functions.

common_schema provides query scripting, analysis & informational views, and a function library, allowing for easier administration and diagnostics for MySQL. It introduces SQL based tools which simplify otherwise complex shell and client scripts, allowing the DBA to be independent of operating system, installed packages and dependencies.

There’s no Perl nor Python, and no dependencies to install. It’s just a schema.

Some highlights for the new revision:

  • foreach(), aka $(): loop through a collection, execute callback commands per element.
  • repeat_exec(): a repeat-until device: execute queries until some condition holds.
  • exec_file(): execute files à la SOURCE, but on the server side.
  • Query analysis: analyze query text, view or routine definitions to detect dependency objects.
  • Improvements to views and routines, new routines introduced.

Let’s take a closer look:

rpbouman

I’m very happy to have Roland Bouman working on this project. He introduced some sophisticated code without which some functionality could not take place. I’m sure I don’t need to introduce his great capabilities; I’ll just pass the note that it is very good working with him!

foreach()

Introducing a looping device which can iterate a collection and execute callback commands.

What’s a collection? A range of numbers; a set of constants; the result set of a SELECT query; tables in your database and more.

What is a callback? A query or set of queries to invoke on the specific elements in the collection. For example:

call foreach('table in sakila', 'ALTER TABLE ${schema}.${table} ENGINE=InnoDB ROW_FORMAT=COMPRESSED');

I’ll publish dedicated posts on foreach(), aka $(), following this post. Official documentation is here.

repeat_exec()

Repeat executing queries in a given interval, until some condition holds.

What kind of condition? You can loop forever, until a given time has passed, or until a given number of iterations has passed. Continue reading » “common_schema, rev. 178: foreach(), repeat_exec(), Roland Bouman, query analysis”

Self throttling MySQL queries

Recap on the problem:

  • A query takes a long time to complete.
  • During this time it makes for a lot of I/O.
  • The query’s I/O overloads the db, making other queries run slowly.

I introduce the notion of self-throttling queries: queries that go to sleep, by themselves, throughout the runtime. The sleep period means the query does not perform I/O at that time, which then means other queries can have their chance to execute.

I present two approaches:

  • The naive approach: for every 1,000 rows, the query sleeps for 1 second
  • The factor approach: for every 1,000 rows, the query sleeps for the amount of time it took to iterate those 1,000 rows (effectively doubling the total runtime of the query). Continue reading » “Self throttling MySQL queries”
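
The naive approach can be sketched roughly as follows. This assumes the sakila sample schema and its rental table as the table being scanned; the @row_count variable and the throttle column are names made up for illustration, and this is not necessarily the exact query from the full post.

```sql
-- A sketch of the naive approach: count rows as they are read,
-- and call SLEEP(1) once for every 1,000 rows.
-- Assumes the sakila sample schema; @row_count and throttle are illustrative names.
-- Note: assigning user variables inside SELECT is deprecated in MySQL 8.0.
SELECT
  rental_id,
  IF((@row_count := @row_count + 1) % 1000 = 0, SLEEP(1), 0) AS throttle
FROM
  sakila.rental,
  (SELECT @row_count := 0) AS init;
```

SLEEP() normally returns 0, so the extra column carries no information; it exists only to force the pause while rows are being iterated.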

Test-driven SQL development

I’m having a lot of fun writing common_schema, an SQL project which includes views, tables and stored routines.

As the project grows (and it’s taking some interesting directions, in my opinion) more dependencies are being introduced, and a change to one routine or view may affect many others. This is why I’ve made the development of common_schema test driven.

Now, just how do you test drive an SQL project?

Well, much like the way you test any other project in your favorite programming language. If it’s functions you’re testing, that’s all too familiar: functions get some input and provide some output. Hmmm, they might be changing SQL data during that time. With procedures it’s slightly more complex, since they do not directly return output but result sets.
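
For a function, such a test can be as small as a single query comparing actual output with expected output. A minimal sketch, assuming a scratch schema named test exists; add_one() is a hypothetical function invented for this example and is not part of common_schema:

```sql
-- Hypothetical function under test; assumes a scratch schema named `test`.
DROP FUNCTION IF EXISTS test.add_one;
CREATE FUNCTION test.add_one(n INT) RETURNS INT DETERMINISTIC
  RETURN n + 1;

-- The test itself: returns 1 when the function yields the expected value, 0 otherwise.
SELECT test.add_one(41) = 42 AS test_passed;
```

A test runner then only needs to check that every such query returns 1.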

Here’s the testing scheme I use: Continue reading » “Test-driven SQL development”

MySQL eval()

I’ve just implemented an eval() call for MySQL. It is implemented with SQL, using a stored procedure. So this is not some plugin: you can use it from within your normal database server.

Just what is an eval() call?

In some programming languages it would mean: get some text, and execute it as though it were compiled code. So, dynamic coding.

In SQL: get the text of a query which generates SQL statements in itself (either DML or DDL), and invoke those implied SQL statements.

A simple example

Best if I present Mass killing of MySQL Connections by Peter Zaitsev. The thing is to execute a query, typically on INFORMATION_SCHEMA, which uses metadata so as to generate SQL queries/commands. Peter’s example is:

select concat('KILL ',id,';') from information_schema.processlist where user='root'

The above query generates KILL commands for all connections of the user ‘root’. I do many such queries in common_schema: like creating the GRANT statements for accounts, the DROP KEY statements for redundant keys, the ADD and DROP statements for foreign keys etc.

So the problem is you have to export those statements to file, then execute them from file: either using SOURCE, as in Peter’s example, or from shell prompt, piping file contents into mysql client.

You can now eval()

I’ve been on family holiday for a couple of weeks, which meant no need to think of work. Which means more time to think of SQL (darn!). And I’ve found the way to do it completely within the server (no external files required). Continue reading » “MySQL eval()”
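
To illustrate the general server-side idea, here is a minimal sketch that generates a single statement as text and runs it through a prepared statement. It is not necessarily how eval() itself is implemented (the full post covers that), and it handles only one generated statement. The sakila schema and the names @generated_statement and generated are assumptions made for the example.

```sql
-- Generate one statement as text from metadata.
-- Assumes the sakila sample schema is installed; names are illustrative.
SELECT CONCAT('ANALYZE TABLE `', table_schema, '`.`', table_name, '`')
  INTO @generated_statement
  FROM information_schema.tables
 WHERE table_schema = 'sakila'
   AND table_type = 'BASE TABLE'
 ORDER BY table_name
 LIMIT 1;

-- Execute the generated text on the server: no export to file, no SOURCE.
PREPARE generated FROM @generated_statement;
EXECUTE generated;
DEALLOCATE PREPARE generated;
```

Running every row of a generating query this way requires a loop inside a stored routine, since a prepared statement holds one statement at a time.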

Announcing common_schema: common views & routines for MySQL

Today I have released common_schema, a utility schema for MySQL which includes many views and functions, and is meant to be installed on any MySQL server.

What does it do?

There are views providing all sorts of useful information: stuff related to schema analysis, data dimensions, monitoring, processes & transactions, security, internals… There are basic functions answering common needs.

Some of the views/routines simply formalize those queries we tend to write over and over again. Others take the place of external tools, answering complex questions via SQL and metadata. Still others help out with SQL generation.

Here are a few highlights:

There’s more. Take a look at the common_schema documentation for full listing. And it’s evolving: I’ve got quite a few ideas already for future components.

Some of these views rely on heavyweight INFORMATION_SCHEMA tables. You should be aware of the impact and risks.

What do I need to install?

There’s no script or executable file. It’s just a schema. The distribution is an SQL file which generates common_schema, much like a dump file.

Continue reading » “Announcing common_schema: common views & routines for MySQL”

ROUTINE_PRIVILEGES implementation

Following up on MySQL security: inconsistencies, and on MySQL bug #61596, I was thinking it may take a long time till the non-existent ROUTINE_PRIVILEGES view is implemented. Here’s my own implementation of the view.

I’ve followed the somewhat strange conventions used in the *_PRIVILEGES tables in INFORMATION_SCHEMA, where the IS_GRANTABLE is a separate column, although in 1st normal form.

I present it here as a query, using session variables, rather than a view definition: Continue reading » “ROUTINE_PRIVILEGES implementation”

Views: better performance with condition pushdown

Justin’s A workaround for the performance problems of TEMPTABLE views post on mysqlperformanceblog.com reminded me of a solution I once saw on a customer’s site.

The customer was using a nested views structure, up to a depth of some 8-9 views. There were a lot of aggregations along the way, and even the simplest query resulted in a LOT of subqueries, temporary tables, and vast amounts of data, even if only to return a couple of rows.

While we worked to solve this, a developer showed me his own trick. His trick is now impossible to implement, but there’s a hack around this.

Let’s use the world database to illustrate. Look at the following view definition: Continue reading » “Views: better performance with condition pushdown”