The quiz presented poses with an uncommon, though valid SQL syntax: one is allowed to use quoted name aliases. Thus, it is valid to write:
SELECT Name AS 'n', Continent AS 'c' FROM countries
But what does the above mean? Let’s see the results of our three questions:
Question #1
SELECT id, Name AS 'n', Continent AS 'c' FROM countries ORDER BY c ; +----+--------------------------+---------------+ | id | n | c | +----+--------------------------+---------------+ | 15 | North Korea | Asia | | 12 | Nepal | Asia | | 11 | Norway | Europe | | 10 | Netherlands | Europe | | 8 | Nicaragua | North America | | 1 | Netherlands Antilles | North America | | 3 | Namibia | Africa | | 5 | Niger | Africa | | 7 | Nigeria | Africa | | 9 | Niue | Oceania | | 4 | New Caledonia | Oceania | | 2 | Northern Mariana Islands | Oceania | | 13 | Nauru | Oceania | | 14 | New Zealand | Oceania | | 6 | Norfolk Island | Oceania | +----+--------------------------+---------------
Result is sorted by continent, as may be expected. We have aliased continent as ‘c’, and ORDER BY c went well. The quotes were meaningless here.
Question #2
SELECT MIN(Name) AS 'n', Continent AS 'c' FROM countries GROUP BY c ; +----------------------+---------------+ | n | c | +----------------------+---------------+ | Nepal | Asia | | Netherlands | Europe | | Netherlands Antilles | North America | | Namibia | Africa | | Nauru | Oceania | +----------------------+---------------+
As with first query, the quoted ‘c’ acts just like non-quoted c.
Question #3
SELECT MIN(Name) AS 'n', Continent AS 'c' FROM countries GROUP BY 'c' ; +---------+---------------+ | n | c | +---------+---------------+ | Namibia | North America | +---------+---------------+
Obviously something went wrong here. The query is valid, yet returns a single row. Why is that?
What we’ve just done is to group by the constant ‘c’, not on the alias c. Results are complete wrong.
This query is a simplified version of a true customer’s query. The GROUP BY clause included some 5 fields, the constant literal being one of them. The results of that query were used in very important business decision making. Well, decisions were misinformed…
Quoted aliases?
I’m not sure why anyone would want these. Why is this even a valid syntax?
Multi-word aliases? Seems like wrong coding convention. But, if you must, do not use single quotes: use `backticks`.
I know you’re heavily invested in MySql, but really, what you show here is a bug, not a feature.
Most database systems I’ve worked with differentiate quoted aliases from string literals: aliases use double-quotes (“), string literals use single quotes. IIRC, this is true for Oracle, SQL Server, and PostgreSQL.
The third example is particularly damning: Standard SQL says this is an error — in a summary select (one having ‘group by’), you’re only allowed to select either aggregates (min, max, sum etc) or columns named in the ‘group by’ clause. This is not ideal — the ideal would be to allow any columns that are functionally dependent on the columns grouped by; but it is still much better than producing erroneous, essentially random results as MySql does.
This is one of the reasons I recommend to anyone who asks to stay away from MySql.
@Shai,
Thank you. Indeed, the third query only works with a relaxed sql_mode. With ‘ONLY_FULL_GROUP_BY’ this leads to an error, as you would expect.
I should have expected this; relaxed sql_mode has led me before into many pitfalls.
Regardless of being ‘invested’ in MySQL: I have a lot of complaints on MySQL, like here and here, both relevant to the topic of this article;
I’m not sure, though, those consist of “deal breakers”. As contrast, I can provide with a list of non-standard SQL features available in MySQL which actually improve life, some are long-time requested on PostgreSQL.
Anyway, this is not a feature-argument (Oracle will clearly win). I acknowledge the downsides of MySQL; I do not recommend staying away from MySQL for these. Be aware they exist.
It’s worth noting that on top of having a sql mode to protect from the non-full GROUP BY, the identifier usage is also documented (and hence not a bug):
“In the select list of a query, a quoted column alias can be specified using identifier or string quoting characters:
mysql> SELECT 1 AS `one`, 2 AS ‘two’;
+—–+—–+
| one | two |
+—–+—–+
| 1 | 2 |
+—–+—–+
Elsewhere in the statement, quoted references to the alias must use identifier quoting or the reference is treated as a string literal.”
http://dev.mysql.com/doc/refman/5.1/en/identifiers.html
@Mark
thanks. I think Shai’s point is that it does not matter if it’s documented (hence officially not a bug) or not; he reject the entire notion of this allowed syntax, as being non-conforming to standards.
I can see his point on this.
@Shlomi
Thank you.
The reason I recommend against MySql is not a specific failing on this point or that, but an attitude that I see behind a set of them; a preference for short-term developer convenience, at the expense of operational correctness. One instance of this is the faulty default group-by behavior; other cases where MySql prefers to “smooth out” errors and make the program seem to work are silent data truncation and date validation (or is February 30th not a valid date by default now?). I also attribute MyISAM (no transactions or FKs) to this attitude. These are just the cases that pop to mind, and I’ve only been an occasional MySql user. You, as a more experienced user, have indeed seen more of the same, as your links show.
Changing an attitude is extremely difficult. It isn’t just your developers you need to re-educate, it is also the user community. Some people really liked having a file-per-table that you can manually mess with. You are likely to find many examples of invalid code in the MySql webosphere, because such code seems to work. If MySql suddenly turned strict, I think you’ll get a reaction similar to what Microsoft got when they introduced UAC to Vista.
And all the while, the sane, strict, functionally-superior alternatives are just as free, if not free-er (PostgreSQL uses an MIT license). Why bother with MySql?