Some administrative tasks can be simplified by using common_schema/QueryScript. I’m collecting a bunch of these for documentation. Here’s one for example:
The DBA/developer has the task of retaining only top 3 most populated countries per continent. That is, she has to DELETE 4th, 5th, 6th, … most populated counties in each continent.
Is it possible to work out with a single query? Yes. But the query is not pretty. In fact, it is quite complicated, and either involves unintuitive subqueries, or unintuitive hacks. A normal DBA would not want to write, neither maintain this kind of query, unless top-notch-geek, which is fine.
Since this is a one time job, we just need to get it done. And common_schema/QueryScript provide with the intuitive solution: if we read our demand aloud, we realize we want to delete 4th, 5th, 6th, … populated countries for each continent.
I present a solution made available by QueryScript, and discuss the ways in which the code overcomes limitations, or simplifies complexity:
var $num_countries_to_delete;
foreach($continent, $num_countries: SELECT continent, COUNT(*) FROM world.Country GROUP BY continent)
{
if ($num_countries > 3)
{
set $num_countries_to_delete := $num_countries - 3;
DELETE FROM world.Country WHERE Continent = $continent ORDER BY Population ASC LIMIT :$num_countries_to_delete;
}
}
Discussion
The first thing that should be apparent from the above is that this is a programmatic solution. Queries are declarative, which is why complex ones sometimes look incomprehensible. The above is more straightforward. Continue reading » “MySQL/QueryScript use case: DELETE all but top N records per group”