REPLACE INTO: think twice

The REPLACE [INTO] syntax allows us to INSERT a row into a table, except that if a UNIQUE KEY (including PRIMARY KEY) violation occurs, the old row is deleted prior to the new INSERT, hence no violation.

Sounds very attractive, and has a nice syntax as well: the same syntax as a normal INSERT INTO’s. It certainly has a nicer syntax than INSERT INTO … ON DUPLICATE KEY UPDATE, and it’s certainly shorter than using a SELECT to see if a row exists, then doing either INSERT or UPDATE.

But weak hearted people as myself should be aware of the following: it is a heavyweight solution. It may be just what you were looking for in terms of ease of use, but the fact is that on duplicate keys, a DELETE and INSERT are performed, and this calls for a closer look. Continue reading » “REPLACE INTO: think twice”

Using memcached functions for MySQL; an automated alternative to Query Cache

There’s a lot of buzz around memcached. memcached is widely used, and has clients for many programming languages and platforms. TangentOrg have developed a memcached client in the form of MySQL UDFs (User Defined Functions).

I wish to discuss the memcached functions for MySQL: if and how they should be used.

Disclaimer: I do not work with memcached functions for MySQL on a production system; all that is written here reflects my opinion on how things should be done.

With memcached functions for MySQL, we can do the following:

SELECT memc_set('mykey', 'The answer is 42');
SELECT memc_get('mykey');

(See my previous post on how to install memcached functions for MySQL).

In what scenario should we use these functions?

I believe memcached is the right tool for the application level. I am less enthusiastic about using it from MySQL. Sure, pushing it down to MySQL centralizes everything. Instead of having all my application code (PHP, Java etc.) access memcached separately, they can all access one single MySQL node, which gets to access memcached. I see two problems with this approach: Continue reading » “Using memcached functions for MySQL; an automated alternative to Query Cache”

Installing memcached functions for MySQL

tangent.org provide a memcached client for MySQL, in the form of UDFs. The provided set of functions allow for connecting to a memcached server, putting values in the cache, getting values, invalidating, utilizing increments etc.

The code is not (yet?) available in binary format, so the libraries need to be compiled and installed manually. Following is a quick installation HOWTO for Linux users. Continue reading » “Installing memcached functions for MySQL”

MySQL’s character sets and collations demystified

MySQL’s character sets and collations are often considered as a mystery, and many users either completely disregard them and keep with the defaults, or set everything to UTF8.

This post will attempt to shed some light on the mystery, and provide with some best practices for use with text columns with regard to character sets. Continue reading » “MySQL’s character sets and collations demystified”

Dynamic sequencing with a single query

It is a known trick to use a session variables for dynamically counting/sequencing rows. The way to go is to SET a variable to zero, then use arithmetic within assignment to increment its value for each row in the SELECTed rows.

For example, the following query lists the top 10 populated countries, using MySQL’s world database:

SELECT Code, Name, Population
FROM Country ORDER BY Population DESC LIMIT 10;

+------+--------------------+------------+
| Code | Name               | Population |
+------+--------------------+------------+
| CHN  | China              | 1277558000 |
| IND  | India              | 1013662000 |
| USA  | United States      |  278357000 |
| IDN  | Indonesia          |  212107000 |
| BRA  | Brazil             |  170115000 |
| PAK  | Pakistan           |  156483000 |
| RUS  | Russian Federation |  146934000 |
| BGD  | Bangladesh         |  129155000 |
| JPN  | Japan              |  126714000 |
| NGA  | Nigeria            |  111506000 |
+------+--------------------+------------+

The results do not provide any sequence number. Nor does the table have an AUTO_INCREMENT or otherwise unique row number. If I were to rate the countries by population, the common trick is:

SET @rank := 0;
SELECT
  @rank := @rank+1 AS rank,
  Code, Name, Population
FROM Country ORDER BY Population DESC LIMIT 10;

+------+------+--------------------+------------+
| rank | Code | Name               | Population |
+------+------+--------------------+------------+
|    1 | CHN  | China              | 1277558000 |
|    2 | IND  | India              | 1013662000 |
|    3 | USA  | United States      |  278357000 |
|    4 | IDN  | Indonesia          |  212107000 |
|    5 | BRA  | Brazil             |  170115000 |
|    6 | PAK  | Pakistan           |  156483000 |
|    7 | RUS  | Russian Federation |  146934000 |
|    8 | BGD  | Bangladesh         |  129155000 |
|    9 | JPN  | Japan              |  126714000 |
|   10 | NGA  | Nigeria            |  111506000 |
+------+------+--------------------+------------+

The first query sets the @rank to zero, so that it is not NULL (since no arithmetic can be done with NULL).  The second query relies on its success.

Can the same be achieved with one query only? That’s more of a problem. Continue reading » “Dynamic sequencing with a single query”

Selecting a specific non aggregated column data in GROUP BY

In a GROUP BY query, MySQL may allow specifying non aggregated columns. For example, using MySQL’s world database, I wish to get the number of countries per continent, along with a “sample” country:

SELECT Continent, COUNT(*), Name
FROM `Country` GROUP BY Continent

+---------------+----------+----------------+
| Continent     | COUNT(*) | Name           |
+---------------+----------+----------------+
| Asia          |       51 | Afghanistan    |
| Europe        |       46 | Albania        |
| North America |       37 | Aruba          |
| Africa        |       58 | Angola         |
| Oceania       |       28 | American Samoa |
| Antarctica    |        5 | Antarctica     |
| South America |       14 | Argentina      |
+---------------+----------+----------------+

What if I want to choose that “sample” country? For example, for each continent, I wish to show the country with the largest population. To simply see the largest population, I would use MAX(Population). But which country is referred? I wish to provide a solution which does not involve sub-queries, HAVING or JOINs. Continue reading » “Selecting a specific non aggregated column data in GROUP BY”

Useful database analysis queries with INFORMATION_SCHEMA

A set of useful queries on INFORMATION_SCHEMA follows. These queries can be used when approaching a new database, to learn about some of its properties, or they can be regularly used on an existing schema, so as to verify its integrity.

I will present queries for:

Less known SQL syntax and functions in MySQL

“Standard SQL” is something you read about. All popular databases have modified version of SQL. Each database adds its own flavor and features to the standard. MySQL is no different.

Some deviations are storage engine dependent. Others are more general. Many, such as INSERT IGNORE, are commonly used. Here’s a list of some MySQL deviations to SQL, which are not so well known. Continue reading » “Less known SQL syntax and functions in MySQL”

Common wrong Data Types compilation

During my work with companies using MySQL, I have encountered many issues with regard to schema design, normalization and indexing. Of the most common errors are incorrect data types definition. Many times the database is designed by programmers or otherwise non-expert DBAs. Some companies do not have the time and cannot spare the effort of redesigning and refactoring their databases, and eventually face poor performance issues.

Here’s a compilation of “the right and the wrong” data types. Continue reading » “Common wrong Data Types compilation”

Dangers of skip-grant-tables

When MySQL’s root password is lost and must be reset, there are two popular ways to create a new password. One of the options is far too popular, in my opinion.

The preferred way of setting a root’s password is by using an init-file. The process for doing this is well explained in MySQL’s manual. Using this method requires creating a simple text file, in which the required

GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFY BY '****' WIth GRANT OPTION;

(or, alternatively,  SET PASSWORD ...) statement is written.

An entry must be written to my.cnf, or supplied via command line parameters:

init-file=/tmp/my-init-file.sql

MySQL must then be restarted. Upon restart, and before opening any outside connections, the init-file is executed. Once MySQL is up and running, the init-file entry should be dropped. Continue reading » “Dangers of skip-grant-tables”