Proper SQL table alias use conventions

March 11, 2010

After seeing a great many SQL statements over the years, I find one thing keeps bugging me: there is no consistent convention for how to write an SQL query.

I'm going to leave formatting and upper/lower-case issues aside, and discuss a small part of the SQL syntax: table aliases. Looking at three different queries, I will describe what I find to be problematic table alias use.

Using the sakila database, take a look at the following queries:

Query #1

SELECT
 R.rental_date, C.customer_id, C.first_name, C.last_name
FROM
 rental R
 JOIN customer C USING (customer_id)
WHERE
 R.rental_date >= DATE('2005-10-01')
 AND C.store_id=1;

The above looks for film rentals made on or after Oct. 1st, 2005, by customers of a specific store (store #1).

Query #2

SELECT
 F.title, C.name
FROM
 film AS F
 JOIN film_category AS S ON (F.film_id = S.film_id)
 JOIN category AS C ON (S.category_id = C.category_id)
WHERE F.length > 180;

The above lists the title and category for all films longer than three hours.

Query #3

SELECT c.customer_id, c.last_name
FROM
  customer c
  INNER JOIN address a ON (c.address_id = a.address_id)
  INNER JOIN (
    SELECT
      c.city_id
    FROM
      city AS c
      JOIN country s ON (c.country_id = s.country_id)
    WHERE
      s.country LIKE 'F%'
  ) s1 USING (city_id)
WHERE
  create_date >= DATE('2005-10-01');

The above lists customers created on or after Oct. 1st, 2005, who live in countries whose names start with an 'F'. The query could be written without a subquery, but there's a good reason why I wrote it this way.

The problems

I used very different conventions in each of the queries, and sometimes within a single query. I commonly see the same on customers' sites, where many programmers write the SQL code. Again, I will only discuss the table alias conventions. I'll leave the rest to the reader.

Here's where I see problems:

  • Query #1: In itself, it looks fine. Rental turns to R, Customer turns to C. I will comment on this slightly later on when I provide my full opinion.
  • Query #2: So film turns to F, category turns to C. What should film_category turn into? Out of letters? Let's just go for S, shall we? But S has nothing to do with film_category. Yet it's so commonly seen.
  • Query #2: We're using the AS keyword now. We didn't use it before.
  • Queries #1, #2: Hold on. Wasn't C taken for customer in Query #1? Now, in Query #2 it stands for category? I'm beginning to get confused.
  • Query #3: Now aliases are lower case; I was just getting used to them being upper case.
  • Query #3: But, hey, c is back to customer!
  • Query #3: Or, is it? Take a look at the subquery. There's another c in there! This time it's city! And it's perfectly valid syntax. We actually have two identical aliases in the same query.
  • Query #3: If I could, I would name country with c as well. But I can't. So why not throw in s again?
  • Query #3: And now I don't even bother using the alias when accessing create_date. Well, there's no such column in any of the other tables!

Proper conventions

What I find so disturbing is that whenever I read a complex query, I need to go back and forth, back and forth between table aliases (found everywhere in the query) and their declaration point. Such irregularities make the queries difficult to read.

Any of the above issues could be justified. But I wish to make some suggestions:

  • Decide whether you're going for upper or lower case.
  • Do not use the same alias twice in your query, even if it's valid.
  • Aliases do not have to be a single character. film_category may just as well be FC.
  • Do not use an alias that is hard to interpret. s does not stand for country.
  • Think ahead: use the same aliases throughout all your queries, as far as you can. If uniqueness is a problem, use longer aliases. Use cust instead of c.

The above should make for more organized and readable SQL code. Remember: an alias that one programmer finds very intuitive may be unintuitive to another!
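To illustrate, here is a sketch of Queries #1 and #2 rewritten along these lines. The specific choices (lower case, no AS keyword, the aliases r, cust, f, fc and cat) are just one possible convention picked for the example; what matters is that the same table gets the same alias in both queries.

```sql
-- Query #1, one possible consistent style (alias names are illustrative)
SELECT
  r.rental_date, cust.customer_id, cust.first_name, cust.last_name
FROM
  rental r
  JOIN customer cust USING (customer_id)
WHERE
  r.rental_date >= DATE('2005-10-01')
  AND cust.store_id = 1;

-- Query #2, same style: film_category becomes fc, category becomes cat,
-- so nothing collides with cust from the query above
SELECT
  f.title, cat.name
FROM
  film f
  JOIN film_category fc ON (f.film_id = fc.film_id)
  JOIN category cat ON (fc.category_id = cat.category_id)
WHERE
  f.length > 180;
```

Both queries should return the same rows as the originals; only the alias names and their casing differ.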

My own convention

Simple: I only use aliases in self joins. I am aware that long table names make the queries much longer. I go further than that: I prefer fully qualifying questionable columns throughout the query. Yes, it makes the query even longer.

I know this does not appeal to many. But there's no confusion. And it's easily searchable. And it's consistent. And if properly formatted, as in the above queries, it is quite readable.
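As a rough illustration of this convention, Query #2 could look like the following with no aliases at all and every column qualified by its full table name. This is only a sketch of the style; the tables and columns are the same sakila ones used above.

```sql
-- Query #2 without table aliases: every column is qualified
-- by its full table name
SELECT
  film.title, category.name
FROM
  film
  JOIN film_category ON (film.film_id = film_category.film_id)
  JOIN category ON (film_category.category_id = category.category_id)
WHERE
  film.length > 180;
```

It is longer, but searching the code for film_category finds every place the table is touched.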

Now please join me in asking Oracle if they can add multi-line strings to Java, as there are in Python.

posted in MySQL by shlomi


9 Comments to "Proper SQL table alias use conventions"

  1. Bill Karwin wrote:

    As with any other coding language, the specific details of your coding conventions are not as important as *having* coding conventions, and following them more or less consistently.

    I think your recommendations are fine. You are, however, missing any recommendations for how to name aliases when you use self-joins.

    I find that sometimes, the alias may even be longer than the table name:

    SELECT later_customer.*
    FROM 
      customers AS first_customer
      JOIN customers AS later_customer ON (first_customer.cust_id = hypothetical_earlier_customer.cust_id)
    WHERE 
      hypothetical_earlier_customer.cust_id IS NULL;
    
  2. Bill Karwin wrote:

    Er, part of my example query above was eaten. :(

  3. shlomi wrote:

    @Bill

    Thank you. Yes, I completely agree. Having a convention is the most important part; I only gave my recommendations.
    Indeed I did not tackle the self join issue. Thank you for your example! (I hope I fixed your code correctly?)

  4. shlomi wrote:

    Hmmm... something still missing in the query...

  5. hed wrote:

    As the semi proud originator of the aliasing convention post.
    (not because of the well behaved queries)

    I want to add
    that we used to hold our queries in the Java code, and it was horrible.
    most annoying was the time it took to switch from code to shell execute. (the query formatting)
    So we moved to an XML repository which is much better.
    Queries are much more readable and extractable.

  6. shlomi wrote:

    Hi Hed,

    Indeed, the XML repository makes for more readable queries, plus it concentrates queries in one place. Which is why the Spring Framework, iBATIS etc. support and encourage XML files.
    One can argue whether this is the better method. I confess I still like to see my queries next to the relevant code. But I believe this is mostly a development-tool need-to-solve problem.

    Shlomi

  7. Sheeri Cabral wrote:

    I personally hate single or 2-character table aliases, because when using EXPLAIN, you see the table alias only......

  8. PHP-help » Log Buffer #182, a Carnival of the Vanities for DBAs wrote:

    [...] is no consistent convention as for how to write [about table aliases in] an SQL query” in proper sql table alias use conventions. Noach also gives us a tip: faster than [...]

  9. Log Buffer #182, A Carnival of the Vanities for DBAs wrote:

    [...] is no consistent convention as for how to write [about table aliases in] an SQL query” in proper sql table alias use conventions. Noach also gives us a tip: faster than [...]
