A simple question I’ve been asked:
Is it possible to merge two UPDATE queries, each on different WHERE conditions, into a single query?
For example, is it possible to merge the following two UPDATE statements into one?
mysql> UPDATE film SET rental_duration=rental_duration+1 WHERE rating = 'G'; Query OK, 178 rows affected (0.01 sec) mysql> UPDATE film SET rental_rate=rental_rate-0.5 WHERE length < 90; Query OK, 320 rows affected (0.01 sec)
To verify our tests, we take a checksum:
mysql> pager md5sum PAGER set to 'md5sum' mysql> SELECT film_id, title, rental_duration, rental_rate FROM film ORDER BY film_id; c2d253c3919efaa6d11487b1fd5061f3 -
Obviously, the following query is incorrect:
mysql> UPDATE film SET rental_duration=rental_duration+1, rental_rate=rental_rate-0.5 WHERE rating = 'G' OR length < 90; Query OK, 431 rows affected (0.03 sec) mysql> pager md5sum PAGER set to 'md5sum' mysql> SELECT film_id, title, rental_duration, rental_rate FROM film ORDER BY film_id; 09d450806e2cd7fa78a83ac5bef72d2b -
Motivation
Why would you want to do that?
- While it may seem strange, the merge can be logically (application-wise) perfectly reasonable.
- The UPDATE may be time consuming – perhaps it requires full table scan on a large table. Doing it with one scan is faster than two scans.
The solution
Use a condition for the SET clauses, optionally drop the WHERE conditions.
UPDATE film SET rental_duration=IF(rating = 'G', rental_duration+1, rental_duration), rental_rate=IF(length < 90, rental_rate-0.5, rental_rate) ; mysql> pager md5sum PAGER set to 'md5sum' mysql> SELECT film_id, title, rental_duration, rental_rate FROM film ORDER BY film_id; c2d253c3919efaa6d11487b1fd5061f3 -
The above query necessarily does a full table scan. If there’s a benefit to using indexes in the WHERE clause, it may still be applied, using an OR condition:
UPDATE film SET rental_duration=IF(rating = 'G', rental_duration+1, rental_duration), rental_rate=IF(length < 90, rental_rate-0.5, rental_rate) WHERE rating = 'G' OR length < 90 ;
If there is a computational overhead to the IF statement, I have not noticed it. This kind of solution plays well when each of the distinct queries requires a full scan, on large tables.
it’s a pity the manual calls this “control-flow-functions”. I proposed “conditional expressions” some time ago, but they stick to this term because it has always been there.
Very usefull! thanks 🙂
I exactly needed something like that… Thnxx for dis Logic…..