Generating numbers out of seemingly thin air

In some of my previous posts I’ve used a numbers table, like one holding values 1, 2, 3, …, 255. Such table can be used for string walking, joining with other tables, performing iterations.

The existence of number tables has always been a little pain. Yes, they’re very, very simple, but they need to be there. So if you just need to script some SQL query, you may find that you need to create such tables. Ummm… this means you need to have privileges (at least CREATE TEMPORARY and INSERT, if not CREATE).

The other day, Baron Schwartz posted How to round to the nearest whole multiple or fraction in SQL. In an offhand way, he generated some random numbers using the mysql.help_topic table. I then realized that post solved something I’ve been looking for: using a sure-to-exist table on any MySQL installation.

What does the table consist of? It consists, among other columns, an incrementing help_topic_id column:

SELECT help_topic_id FROM mysql.help_topic LIMIT 10;
+---------------+
| help_topic_id |
+---------------+
|             0 |
|             1 |
|             2 |
|             3 |
|             4 |
|             5 |
|             6 |
|             7 |
|             8 |
|             9 |
+---------------+

Still feels unsafe?

The above result provides with sequential integers. But can we guarantee this? Will the numbers never have skipped values? We don’t have to rely on these values. We can force them to our liking:

SELECT @counter := @counter+1 AS value
FROM mysql.help_topic, (SELECT @counter := 0) AS sel1
LIMIT 10;
+-------+
| value |
+-------+
|     1 |
|     2 |
|     3 |
|     4 |
|     5 |
|     6 |
|     7 |
|     8 |
|     9 |
|    10 |
+-------+

All we actually need is the existence of rows within this table. We don’t care which columns, what their names are, and of which data types they are. Said table currently has 484 rows. One can use CROSS JOIN to achieve more than that:

SELECT @counter := @counter+1 AS value
FROM mysql.help_topic t1, mysql.help_topic t2, (SELECT @counter := 0) AS sel1
LIMIT 20000;
+-------+
| value |
+-------+
|     1 |
|     2 |
|     3 |
|     4 |
|     5 |
...
| 19992 |
| 19993 |
| 19994 |
| 19995 |
| 19996 |
| 19997 |
| 19998 |
| 19999 |
| 20000 |
+-------+

Number generation

We are now in full control of generated numbers. We don’t have to generate sequential numbers. We can generate odd numbers only; multiples of 10, of PI… Following I’ll be generating the Fibonacci series:

SELECT @c3 := @c1 + @c2 AS value, @c1 := @c2, @c2 := @c3
FROM mysql.help_topic, (SELECT @c1 := 1, @c2 := 0) sel1
LIMIT 15;
+-------+------------+------------+
| value | @c1 := @c2 | @c2 := @c3 |
+-------+------------+------------+
|     1 |          0 |          1 |
|     1 |          1 |          1 |
|     2 |          1 |          2 |
|     3 |          2 |          3 |
|     5 |          3 |          5 |
|     8 |          5 |          8 |
|    13 |          8 |         13 |
|    21 |         13 |         21 |
|    34 |         21 |         34 |
|    55 |         34 |         55 |
|    89 |         55 |         89 |
|   144 |         89 |        144 |
|   233 |        144 |        233 |
|   377 |        233 |        377 |
|   610 |        377 |        610 |
+-------+------------+------------+

Conclusion

Using 5.0 and above, you can also use the various INFORMATION_SCHEMA tables (e.g. INFORMATION_SCHEMA.COLLATIONS). Some of these may be slow to load, though.

When you can (and need), have a prepared numbers table. When unable to create one, you can generate such numbers using tables which are certain to exist (at least until the next major version).

9 thoughts on “Generating numbers out of seemingly thin air

  1. I mostly use UNIONs for this purpose:

    SELECT a.x*100+b.x*10+c.x
    FROM (
    SELECT 0 x UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
    ) a, (
    SELECT 0 x UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
    ) b, (
    SELECT 0 x UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
    ) c

    I don’t know what’s better. You don’t need more priviledges for this than SELECT and it works with 4.1, but of course that puts more load onto the parser and you have much more UNIONs. I’m hoping that generating the bulk of the data with JOINs uses optimized code in the MySQL server like the JOIN buffer and temporary tables and that the overhead doesn’t matter any more.

  2. @Giuseppe,
    Thanks for the references; Very interesting!

    @Mark,
    What a coincidence!

    @strcmp,
    My personal view is that shorter is usually better. Your solution has the property of choosing exactly 1000 thousand values, though.

  3. as a former assembler and C programmer and number cruncher iterative code like “@counter := @counter + 1” looks suboptimal to me, because it is a data dependency between the iterations, forcing the code to be executed serially. of course that’s just me and a totally useless comment right now, but even MySQL may once leave the stone ages and execute JOINs and table/index scans in parallel… bulk operations ‘feel’ better.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.