NAME
oak-chunk-update: Perform long, non-blocking UPDATE/DELETE operations in automatically managed small chunks
SYNOPSIS
Delete rows from world.City where population is small:
oak-chunk-update --database=world --execute="DELETE FROM City WHERE Population < 10000000 AND OAK_CHUNK(City)"
Same as above, provide fully qualified table names:
oak-chunk-update --execute="DELETE FROM world.City WHERE Population < 10000000 AND OAK_CHUNK(world.City)"
Same as above, use a chunk size of 100 rows, verbose:
oak-chunk-update --database=world --execute="DELETE FROM City WHERE Population < 10000000 AND OAK_CHUNK(City)" --chunk-size=100 --verbose
Same as above, print progress:
oak-chunk-update --database=world --execute="DELETE FROM City WHERE Population < 10000000 AND OAK_CHUNK(City)" --chunk-size=100 --verbose --print-progress
Same as above, do not log to binary log:
oak-chunk-update --database=world --execute="DELETE FROM City WHERE Population < 10000000 AND OAK_CHUNK(City)" --chunk-size=100 --verbose --print-progress --no-log-bin
Same as above, sleep for 100 milliseconds between chunks:
oak-chunk-update --database=world --execute="DELETE FROM City WHERE Population < 10000000 AND OAK_CHUNK(City)" --chunk-size=100 --sleep=100 --verbose
Perform an UPDATE operation:
oak-chunk-update --database=world --execute="UPDATE City SET Population = Population+1 WHERE OAK_CHUNK(City)"
Perform a multi-table UPDATE operation, choose world.City as chunking table:
oak-chunk-update --execute="UPDATE City, Country SET City.District = 'unknown' WHERE City.CountryCode = Country.Code AND Country.Continent = 'Africa' AND OAK_CHUNK(City)"
Provide connection parameters. Prompt for password:
oak-chunk-update --user=root --ask-pass --socket=/tmp/mysql.sock --database=world --execute="DELETE FROM City WHERE Population < 10000000 AND OAK_CHUNK(City)"
Use a defaults file for parameters:
oak-chunk-update --defaults-file=/home/myuser/.my-oak.cnf --database=world --execute="DELETE FROM City WHERE Population < 10000000 AND OAK_CHUNK(City)"
DESCRIPTION
This utility allows for splitting long-running or non-indexed UPDATE/DELETE operations, optionally multi-table ones.
Long-running updating queries are common. Some examples:
- Purging old table records (e.g. purging old logs).
- Updating a column across an entire table.
- Deleting or updating a small number of rows, but with a non-indexed search condition.
oak-chunk-update splits such long-running tasks into small chunks. It also allows for sleep time between chunks. The result is shorter lock times, better replication responsiveness (less lag) and less stress on system resources (CPU, IO).
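The chunk-and-sleep idea described above can be illustrated with a minimal Python 3 sketch. This is not the utility's code: it uses the standard-library sqlite3 module as a stand-in for MySQL, assumes a hypothetical integer `id` key, and walks fixed-width key ranges for simplicity.

```python
import sqlite3
import time


def chunked_delete(conn, chunk_size=1000, sleep_millis=0):
    """Delete small cities one key range at a time; return rows deleted."""
    low, high = conn.execute("SELECT MIN(id), MAX(id) FROM City").fetchone()
    deleted = 0
    if low is None:
        return deleted  # empty table, nothing to do
    start = low
    while start <= high:
        end = start + chunk_size - 1
        cur = conn.execute(
            "DELETE FROM City WHERE Population < ? AND id BETWEEN ? AND ?",
            (10000000, start, end),
        )
        conn.commit()  # each chunk is its own short transaction
        deleted += cur.rowcount
        start = end + 1
        if sleep_millis:
            time.sleep(sleep_millis / 1000.0)  # give other work a chance
    return deleted


# Demo data (assumption): 20 cities with populations of 1 to 20 million.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE City (id INTEGER PRIMARY KEY, Population INTEGER)")
conn.executemany(
    "INSERT INTO City (id, Population) VALUES (?, ?)",
    [(i, i * 1000000) for i in range(1, 21)],
)
conn.commit()

total_deleted = chunked_delete(conn, chunk_size=5)
remaining = conn.execute("SELECT COUNT(*) FROM City").fetchone()[0]
```

Each iteration touches at most `chunk_size` key values and commits immediately, so no single statement holds locks for long.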
To do so, the utility uses a UNIQUE KEY on a given table to split the work.
Note that the query may involve multiple tables (JOINed), in which case at least one of the tables must have a UNIQUE KEY.
The utility therefore requires that:
- At least one of the tables participating in the UPDATE/DELETE query has a UNIQUE KEY.
- The query indicates to the utility the table whose UNIQUE KEY is to be used.
The query must include a hint in the form OAK_CHUNK(table_name) or OAK_CHUNK(database_name.table_name). See SYNOPSIS for examples.
The table indicated in the OAK_CHUNK clause is the table which must contain a UNIQUE KEY, which is used for splitting the query. The utility rewrites the query by iteratively replacing the OAK_CHUNK(…) clause with appropriate values from the UNIQUE KEY.
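As a rough illustration of the rewriting step, the Python sketch below replaces the hint with a range condition on a key column. It assumes a single-column integer key; the function name, the regular expression and the exact SQL produced are assumptions for illustration, not the utility's actual output.

```python
import re

# Matches OAK_CHUNK(table_name) or OAK_CHUNK(database_name.table_name).
OAK_CHUNK_RE = re.compile(r"OAK_CHUNK\(\s*([A-Za-z0-9_$.]+)\s*\)")


def rewrite_chunk(query, key_column, range_start, range_end):
    """Return the query with the OAK_CHUNK hint replaced by one key range."""
    match = OAK_CHUNK_RE.search(query)
    if match is None:
        raise ValueError("query must contain an OAK_CHUNK(...) hint")
    table = match.group(1)
    condition = "(%s.%s >= %d AND %s.%s <= %d)" % (
        table, key_column, range_start,
        table, key_column, range_end,
    )
    return query[:match.start()] + condition + query[match.end():]


template = "DELETE FROM City WHERE Population < 10000000 AND OAK_CHUNK(City)"
first_chunk = rewrite_chunk(template, "ID", 1, 1000)
```

Calling the function repeatedly with successive ranges yields one small statement per chunk.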
If more than one UNIQUE KEY is available on the table, the utility chooses one in the following order of preference:
- If there is a PRIMARY KEY, it is the selected key.
- A key whose first column is non-textual is preferable to a key whose first column is textual.
- A key with a smaller numeric data type takes precedence.
- A key with fewer columns takes precedence.
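One way to read the ordering above is as a sort key, as in the following Python sketch. The `UniqueKey` record and the handling of non-integer types are assumptions made for illustration; the byte sizes are those of MySQL's integer types.

```python
from collections import namedtuple

# Hypothetical description of a candidate key (not the utility's own type).
UniqueKey = namedtuple(
    "UniqueKey", "name is_primary first_column_type column_count"
)

# Storage size in bytes of MySQL integer types.
INTEGER_SIZES = {"tinyint": 1, "smallint": 2, "mediumint": 3, "int": 4, "bigint": 8}
TEXTUAL_TYPES = {"char", "varchar", "tinytext", "text", "mediumtext", "longtext"}


def preference(key):
    """Smaller tuples are preferred; fields follow the documented order."""
    is_textual = key.first_column_type in TEXTUAL_TYPES
    # Assumption: types that are not integers rank after all integer types.
    size = INTEGER_SIZES.get(key.first_column_type, 16)
    return (not key.is_primary, is_textual, size, key.column_count)


def choose_key(keys):
    return min(keys, key=preference)


candidates = [
    UniqueKey("uk_name", False, "varchar", 1),
    UniqueKey("uk_code", False, "int", 2),
    UniqueKey("uk_small", False, "smallint", 2),
]
chosen = choose_key(candidates)
```

Here `chosen` is `uk_small`: no PRIMARY KEY is present, the textual key is ranked last, and `smallint` is smaller than `int`.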
OPTIONS
--ask-pass
Prompt for password.
-c CHUNK_SIZE, --chunk-size=CHUNK_SIZE
Number of rows to act on in each chunk (default: 1000). 0 means all rows are updated in one operation.
The lower the number, the shorter any locks are held, but the more operations are required and the longer the total running time.
-d DATABASE, --database=DATABASE
Database name (required unless table is fully qualified).
--defaults-file=DEFAULTS_FILE
Read from MySQL configuration file. Overrides --user, --password, --socket, --port.
Configuration needs to be in the following format:
[client]
user=my_user
password=my_pass
socket=/tmp/mysql.sock
port=3306
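For illustration, a file in this format can be read with Python 3's standard configparser module, as sketched below. How the utility itself parses the file is not stated here; the function name is hypothetical.

```python
import configparser

SAMPLE = """[client]
user=my_user
password=my_pass
socket=/tmp/mysql.sock
port=3306
"""


def read_client_section(text):
    """Return the [client] settings as a dict, with port as an integer."""
    # interpolation=None keeps '%' characters in passwords literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    client = dict(parser["client"])
    if "port" in client:
        client["port"] = int(client["port"])
    return client


settings = read_client_section(SAMPLE)
```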
-e EXECUTE_QUERY, --execute=EXECUTE_QUERY
Query (UPDATE or DELETE) to execute, which contains a chunk placeholder (required).
-H HOST, --host=HOST
MySQL host (default: localhost)
--no-log-bin
Do not log to binary log (actions will not replicate). This may be useful if the slave already struggles to keep up with the master. The utility can then be run manually on each slave machine, which makes use of more than one CPU core on those machines and speeds up replication through parallelism.
-p PASSWORD, --password=PASSWORD
MySQL password
-P PORT, --port=PORT
TCP/IP port (default: 3306)
--print-progress
Show number of affected rows during utility runtime.
--sleep=SLEEP_MILLIS
Number of milliseconds to sleep between chunks (default: 0).
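The trade-off between --chunk-size and --sleep can be estimated with simple arithmetic, as in this Python 3 sketch. It assumes every chunk covers exactly the given number of table rows and that the sleep happens only between chunks; the function name is hypothetical and query execution time is not included.

```python
import math


def estimate_run(total_rows, chunk_size=1000, sleep_millis=0):
    """Return (number of chunks, total seconds spent sleeping)."""
    if chunk_size == 0:
        chunks = 1  # 0 means all rows in one operation
    else:
        chunks = max(1, math.ceil(total_rows / chunk_size))
    # One sleep between each pair of consecutive chunks.
    sleep_seconds = (chunks - 1) * sleep_millis / 1000.0
    return chunks, sleep_seconds


chunks, sleep_seconds = estimate_run(1000000, chunk_size=1000, sleep_millis=100)
```

For a table of one million rows, the default chunk size gives 1000 chunks, and a 100 millisecond sleep adds roughly 100 seconds to the run.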
-S SOCKET, --socket=SOCKET
MySQL socket file. Only applies when host is localhost.
-u USER, --user=USER
MySQL user
-v, --verbose
Print user-friendly messages.
ENVIRONMENT
Requires MySQL 5.0 or newer, Python 2.3 or newer.
python-mysqldb must be installed in order to use this tool. You can install it with:
apt-get install python-mysqldb
or
yum install mysql-python
SEE ALSO
LICENSE
This tool is released under the BSD license.
Copyright (c) 2008-2009, Shlomi Noach
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
* Neither the name of the organization nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
AUTHOR
Shlomi Noach