SQL: selecting top N records per group, another solution

A while back I presented SQL: selecting top N records per group, a “give me the top 5 countries in each continent” type of query, and which used an external numbers table and a lot of tedious casting.

Here’s another solution I came up with (*). Still using GROUP_CONCAT (how else?), but no external table and no casting. The query outputs the largest 5 countries (by surface area) per continent.

SELECT
  Continent,
  Name,
  SurfaceArea,
  Population
FROM
  world.Country,
  (
    SELECT 
      GROUP_CONCAT(top_codes_per_group) AS top_codes
    FROM
      (
        SELECT 
          SUBSTRING_INDEX(GROUP_CONCAT(Code ORDER BY SurfaceArea DESC), ',', 5) AS top_codes_per_group
        FROM
          world.Country
        GROUP BY
          Continent
      ) s_top_codes_per_group
  ) s_top_codes
WHERE
  FIND_IN_SET(Code, top_codes)
ORDER BY
  Continent,
  SurfaceArea DESC
;

+---------------+----------------------------------------------+-------------+------------+
| Continent     | Name                                         | SurfaceArea | Population |
+---------------+----------------------------------------------+-------------+------------+
| Asia          | China                                        |  9572900.00 | 1277558000 |
| Asia          | India                                        |  3287263.00 | 1013662000 |
| Asia          | Kazakstan                                    |  2724900.00 |   16223000 |
| Asia          | Saudi Arabia                                 |  2149690.00 |   21607000 |
| Asia          | Indonesia                                    |  1904569.00 |  212107000 |
| Europe        | Russian Federation                           | 17075400.00 |  146934000 |
| Europe        | Ukraine                                      |   603700.00 |   50456000 |
| Europe        | France                                       |   551500.00 |   59225700 |
| Europe        | Spain                                        |   505992.00 |   39441700 |
| Europe        | Sweden                                       |   449964.00 |    8861400 |
| North America | Canada                                       |  9970610.00 |   31147000 |
| North America | United States                                |  9363520.00 |  278357000 |
| North America | Greenland                                    |  2166090.00 |      56000 |
| North America | Mexico                                       |  1958201.00 |   98881000 |
| North America | Nicaragua                                    |   130000.00 |    5074000 |
| Africa        | Sudan                                        |  2505813.00 |   29490000 |
| Africa        | Algeria                                      |  2381741.00 |   31471000 |
| Africa        | Congo, The Democratic Republic of the        |  2344858.00 |   51654000 |
| Africa        | Libyan Arab Jamahiriya                       |  1759540.00 |    5605000 |
| Africa        | Chad                                         |  1284000.00 |    7651000 |
| Oceania       | Australia                                    |  7741220.00 |   18886000 |
| Oceania       | Papua New Guinea                             |   462840.00 |    4807000 |
| Oceania       | New Zealand                                  |   270534.00 |    3862000 |
| Oceania       | Solomon Islands                              |    28896.00 |     444000 |
| Oceania       | New Caledonia                                |    18575.00 |     214000 |
| Antarctica    | Antarctica                                   | 13120000.00 |          0 |
| Antarctica    | French Southern territories                  |     7780.00 |          0 |
| Antarctica    | South Georgia and the South Sandwich Islands |     3903.00 |          0 |
| Antarctica    | Heard Island and McDonald Islands            |      359.00 |          0 |
| Antarctica    | Bouvet Island                                |       59.00 |          0 |
| South America | Brazil                                       |  8547403.00 |  170115000 |
| South America | Argentina                                    |  2780400.00 |   37032000 |
| South America | Peru                                         |  1285216.00 |   25662000 |
| South America | Colombia                                     |  1138914.00 |   42321000 |
| South America | Bolivia                                      |  1098581.00 |    8329000 |
+---------------+----------------------------------------------+-------------+------------+

In bold are the conditions by which we nominate our selected rows (condition is SurfaceArea DESC, number of rows is 5, so 5 largest countries).

Continue reading » “SQL: selecting top N records per group, another solution”

Ways to export MySQL result set to file on client side

Problem: you wish to write table data to file, but you wish to do so on client side.

SELECT … INTO OUTFILE writes the file on server. What are your options on client?

1. mysql client

If you have direct access from your client machine to your DB server machine, and can connect via mysql client, you get a very customizable file write:

bash$ mysql -h db_host --execute "SELECT * FROM my_table" > /tmp/output.txt

The above writes fancy tables, so you probably want to:

bash$ mysql -h db_host --execute "SELECT * FROM my_table" --bat > /tmp/output.txt

Also try:

bash$ mysql -h db_host --execute "SELECT * FROM my_table" --silent --raw > /tmp/output.txt

To turn off headers.

More options include the –xml or –html formats. Continue reading » “Ways to export MySQL result set to file on client side”

Things that can’t (and some that can) be done from within a MySQL stored routine

I’m doing a lot of stored routine programming lately, working on common_schema. I’m in particular touching at the extremes of abilities. Some things just can’t be done from within a stored routine. Here’s a list of can’t be done:

  • Cursor for SHOW statements: can’t be done — this is explicitly blocked from operating (it once used to work).
  • Get detailed error information on exceptions: apparently 5.6 has support for this. 5.1 and 5.5 do not.
  • Change binlog_format: this is obvious, if you think about it. binlog_format dictates how the routine itself is replicated in the first place.
  • Set sql_log_bin. Again, this makes sense.
  • Work out different results depending on current machine. For example, you can’t have a routine that returns with ‘master’ on the master and with ‘slave’ on the slave. That is, not under any condition. Consider: if Row Based Replication is used, you don’t actually have a routine executing on the slave. I’m happy to be proven wrong on this.
  • Know what database was in use by calling code. The routine executes within the context of the database where it is defined. But you can’t tell what database was in use just a couple milliseconds before.
  • Likewise, know what sql_mode was in use by calling code. Stored routines have their own sql_mode – the one they were created with. No way to check up on the calling stack.
  • And you can’t USE another database (database as in schema). USE is a client command.
  • Reconnect after failure (kind of obvious, isn’t it?)
  • Connect to other servers (not so obvious to SQL Server DBAs). You can’t issue queries on other servers. Bummer.
  • Shutdown the server
  • Fork (you’re in a connection, you can’t issue a new connection from your own connection)

Continue reading » “Things that can’t (and some that can) be done from within a MySQL stored routine”