MySQL Optimization Tips

When the database seems to be "slow" first consider all of the following points as e.g. making a certain query absolutely unnecessary by simply using a more sophisticated algorithm in the application is always the most elegant way of optimising it...

When the database seems to be "slow" first consider all of the following points as e.g. making a certain query absolutely unnecessary by simply using a more sophisticated algorithm in the application is always the most elegant way of optimising it...

Before Starting To Optimise

When the database seems to be "slow" first consider all of the following points as e.g. making a certain query absolutely unnecessary by simply using a more sophisticated algorithm in the application is always the most elegant way of optimising it :)

Finding the bottleneck (CPU, memory, I/O, which queries)
Optimising the application (remove unnecessary queries or cache PHP generated web pages)
Optimising the queries (using indices, temporary tables or different ways of joining)
Optimising the database server (cache sizes etc)
Optimising the system (different filesystem types, swap space and kernel versions)
Optimising the hardware (sometimes indeed the cheapest and fastest way)

To find those bottlenecks the following tools have been found to be helpful:

vmstat: to quickly monitor cpu, memory and I/O usage and decide which is the bottleneck
top: to check the current memory and cpu usage of mysqld as well as of the applications
mytop: to figure out which queries cause trouble
mysql-admin (the GUI application, not to confuse with mysqladmin): to monitor and tune mysql in a very convenient way
mysqlreport: which ouput should be use as kind of step by step check list

Using these tools most applications can also be categorised very broadly using the following groups:

I/O based and reading (blogs, news)
I/O based and writing (web access tracker, accounting data collection)
CPU based (complex content management systems, business apps)

Optimising The Queries

Comparing functions with BENCHMARK

The BENCHMARK() function can be used to compare the speed of MySQL functions or operators. For example:

mysql> SELECT BENCHMARK(100000000, CONCAT('a','b'));
+---------------------------------------+
| BENCHMARK(100000000, CONCAT('a','b')) |
+---------------------------------------+
|                                     0 |
+---------------------------------------+
1 row in set (21.30 sec)

However, this cannot be used to compare queries:

mysql> SELECT BENCHMARK(100, SELECT `id` FROM `lines`);
ERROR 1064 (42000): You have an error in your SQL syntax;
check the manual that corresponds to your MySQL server version for
the right syntax to use near 'SELECT `id` FROM `lignes`)' at line 1

As MySQL needs a fraction of a second just to parse the query and the system is probably busy doing other things, too, benchmarks with runtimes of less than 5-10s can be considered as totally meaningless and equally runtimes differences in that order of magnitude as pure chance.

Analyzing functions with EXPLAIN

When you precede a SELECT statement with the keyword EXPLAIN, MySQL explains how it would process the SELECT, providing information about how tables are joined and in which order.

Using and understanding EXPLAIN is essential when aiming for good performance therefore the relevant chapters of the official documentation are a mandatory reading!

A simple example

The join of two table that both do not have indices:

mysql> explain SELECT * FROM a left join b using (i) WHERE a.i < 2;
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | a     | ALL  | NULL          | NULL | NULL    | NULL |    4 | Using where |
|  1 | SIMPLE      | b     | ALL  | NULL          | NULL | NULL    | NULL |    3 |             |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
2 rows in set (0.01 sec)

Now the second table gets an index and the explain shows that MySQL now knows that only 2 of the 3 rows have to be used.

mysql> ALTER TABLE b ADD KEY(i);
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> explain SELECT * FROM a left join b using (i) WHERE a.i < 2;
+----+-------------+-------+------+---------------+------+---------+----------+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref      | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+----------+------+-------------+
|  1 | SIMPLE      | a     | ALL  | NULL          | NULL | NULL    | NULL     |    4 | Using where |
|  1 | SIMPLE      | b     | ref  | i             | i    | 5       | test.a.i |    2 |             |
+----+-------------+-------+------+---------------+------+---------+----------+------+-------------+
2 rows in set (0.00 sec)

Now the first table also gets an index so that the WHERE condition can be improved amd MySQL knows that only 1 row from the first table is relevant before even trying to search it in the data file.

mysql> ALTER TABLE a ADD KEY(i);
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> explain SELECT * FROM a left join b using (i) WHERE a.i < 2;
+----+-------------+-------+-------+---------------+------+---------+----------+------+-------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref      | rows | Extra       |
+----+-------------+-------+-------+---------------+------+---------+----------+------+-------------+
|  1 | SIMPLE      | a     | range | i             | i    | 5       | NULL     |    1 | Using where |
|  1 | SIMPLE      | b     | ref   | i             | i    | 5       | test.a.i |    2 |             |
+----+-------------+-------+-------+---------------+------+---------+----------+------+-------------+
2 rows in set (0.02 sec)

Optimising The MySQL Server

Status and server variables

MySQL can be monitored and tuned by watching the status-variables and setting the server-variables which can both be global or per session. The status-variables can be monitored by SHOW [GLOBAL|SESSION] STATUS [LIKE '%foo%'] or mysqladmin [extended-]status. The server-variables can be set in the /etc/mysql/my.cnf file or via SET [GLOBAL|SESSION] VARIABLE foo := bar and be shown with mysqladmin variables or SHOW [GLOBAL|SESSION] VARIABLES [LIKE '%foo%'].

Generally status variables start with a capital letter and server variables with a lowercase one.

When dealing with the above mentioned per-session system variables it should always be considered that those have to be multiplied by max_connections to estimate the maximal memory consumption. Failing to do so can easily lead to server crashes at times of load peaks when more than usual clients connect to the server! A quick and dirty estimation can be made with the following formular:

   min_memory_needed = global_buffers + (thread_buffers * max_connections)

   global_buffers:
       key_buffer
       innodb_buffer_pool
       innodb_log_buffer
       innodb_additional_mem_pool
       net_buffer

   thread_buffers:
       sort_buffer
       myisam_sort_buffer
       read_buffer
       join_buffer
       read_rnd_buffer

Note: Especially when dealing with server settings, all information should be verified in the respective chapters of the official documentation as these are subject of change and the authors of this text lack confirmed knowledge about how the server works internally.

Indices

The first and foremost question that is always asked for SELECT queries is always if indices (aka "keys") are configured and if they are, whether or not they are actually be used by the database server.

1. Check if the indices are actually used

Individual queries can be checked with the "EXPLAIN" command. For the whole server the "Sort_%" variables should be monitored as they indicate how often MySQL had to browse through the whole data file because there was no usable index available.

2. Are the indices buffered

Keeping the indices in memory improves read performance a lot. The quotient of "Key_reads / Key_read_requests" tells how often MySQL actually accessed the index file on disk when it needed a key. Same goes for Key_writes, use mysqlreport to do the math for you here. If the percentage is too high, key_buffer_size for MyISAM and innodb_buffer_pool_size for InnoDB are the corresponding variables to tune.

The Key_blocks_% variables can be used to see how much of the configured key buffer is actually used. The unit is 1KB if not set otherwise in key_cache_block_size. As MySQL uses some blocks internally, key_blocks_unused has to be checked. To estimate how big the buffer should be, the sizes of the relevant .MYI files can be summed up. For InnoDB there is innodb_buffer_pool_size although in this case not only the indices but also the data gets buffered.

3. Further settings

sort_buffer_size (per-thread) is the memory that is used for ORDER BY and GROUP BY. myisam_sort_buffer_size is something completely different and should not be altered.

read_buffer_size (per-thread) is the size of memory chunks that are read from disk into memory at once when doing a full table scan as big tables do not fit into memory completely. This seldomly needs tuning.

Query cache

The main reason not to use any MySQL version below 4.0.1 if you have read-based applications is that beginning with that version, MySQL has the ability to store the result of SELECT queries until their tables are modified.

The Query Cache can be configured with the query_cache_% variables. Most important here are the global query_cache_size and query_cache_limit which prevents single queries with unusual big results larger than this size to use up the whole cache.

Note that the Query Cache blocks have a variable size whose minimum size is query_cache_min_res_unit, so after a complete cache flush the number of free blocks is ideally just one. A large value of Qcache_free_blocks just indicates a high fragmentation.

Worth monitoring are the following variables:

Qcache_free_blocks

If this value is high it indicates a high fragmentation which does not need to be a bad thing though.

Qcache_not_cached

If this value is high there are either much uncachable queries (e.g. because they use functions like now()) or the value for query_cache_limit is too low.

Qcache_lowmem_prunes

This is the amount of old results that have been purged because the cache was full and not because their underlying tables have been modified. query_cache_size must be increased to lower this variable.

Examples:

An empty cache:

mysql> SHOW VARIABLES LIKE 'query_cache_type';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| query_cache_type | ON    |
+------------------+-------+
1 row in set (0.00 sec)

mysql> SHOW VARIABLES LIKE 'query_cache_size';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| query_cache_size | 0     |
+------------------+-------+
1 row in set (0.00 sec)

mysql> SHOW STATUS LIKE 'Qcache%';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Qcache_free_blocks      | 0     |
| Qcache_free_memory      | 0     |
| Qcache_hits             | 0     |
| Qcache_inserts          | 0     |
| Qcache_lowmem_prunes    | 0     |
| Qcache_not_cached       | 0     |
| Qcache_queries_in_cache | 0     |
| Qcache_total_blocks     | 0     |
+-------------------------+-------+
8 rows in set (0.00 sec)

A used cache (savannah.gnu.org):

mysql> SHOW VARIABLES LIKE "query_cache_size";
+------------------+----------+
| Variable_name    | Value    |
+------------------+----------+
| query_cache_size | 33554432 |
+------------------+----------+
1 row in set (0.00 sec)

mysql> SHOW STATUS LIKE "Qcache%";
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| Qcache_free_blocks      | 1409     |
| Qcache_free_memory      | 27629552 |
| Qcache_hits             | 7925191  |
| Qcache_inserts          | 3400435  |
| Qcache_lowmem_prunes    | 2946778  |
| Qcache_not_cached       | 71255    |
| Qcache_queries_in_cache | 4546     |
| Qcache_total_blocks     | 10575    |
+-------------------------+----------+
8 rows in set (0.00 sec)

The matching my.cnf configuration parameter is:

query_cache_size = 32M

To clear the cache (useful when testing a new query's efficiency):

mysql> RESET QUERY CACHE;
Query OK, 0 rows affected (0.00 sec)

Waiting for locks

The Table_locks_% variables show the number of queries that had to wait because the tables they tried to access where currently locked by other queries. These situations can be caused by "LOCK TABLE" statements and also by e.g. simultaneous write accesses to the same table.

Table cache

MySQL needs a certain time just to "open" a table and read its meta data like column names etc. If many threads are trying to access the same table, it is opened multiple times. To speed this up the meta data can be cached in the table_cache. A good value for this setting is the number of max_connections multiplied with the number of usually used tables per SELECT.

Using mysqlreport or by looking at the currently Open_tables and ever since Opened_tables as well as the Uptime the number of necessary table opens per second can be calculated (consider the off-peak times like nights though).

Connections and threads

For every client connection (aka session) MySQL creates a separated thread under the main mysqld process. For big sites with several hundred new connections per second, creating the threads itself can consume a significant amount of time. To speed things up, idle threads can be cached after their client disconnected. As a rule of thumb not more than one thread per second should be newly created. Clients that send several queries to the server should use persistent connections like with PHPs mysql_pconnect() function.

This cache can be configured by thread_cache_size and monitored with the threads_% variables.

To avoid overloads MySQL blocks new connections if more than max_connections are currently in use. Start with max_used_connections and monitor the number of connection that were rejected in Aborted_clients and the ones that timed out in Aborted_connections. Forgotten disconnects from clients that use persistent connections can easily lead to a denial of service situation so be aware! Normally connections are closed after wait_timeout seconds of being idle.

Temporary tables

It is perfectly normal that MySQL creates temporary tables while sorting or grouping results. Those tables are either be held in memory or if too large be written to disk which is naturally be much slower. The number of disk tables among the Created_tmp_% variables should be neglectible or else the settings in max_heap_table_size and tmp_table_size be reconsidered.

Delayed writes

In situations like writing webserver access log files to a database, with many subsequent INSERT queries for rather unimportant data into the same table, the performance can be improved by advising the server to cache the write requests a little while and then send a whole batch of data to disk.

Be aware though that all mentioned methods contradicts ACID compliance because INSERT queries are acknowledged with OK to the client before the data has actually be written to disk and thus can still get lost in case of an power outage or server crash. Additionally the side effects mentioned in the documentation often reads like a patient information leaflet of a modern medicament...

MyISAM tables can be given the DELAY_KEY_WRITE option using CREATE or ALTER TABLE. The drawback is that after a crash the table is automatically marked as corrupt and has to be checked/repaired which can take some time.

InnoDB can be told with innodb_flush_log_at_trx_commit to delay writing the data a bit. In case of a server crash the data itself is supposed to be still consistent, just the indices have to be rebuilt.

INSERT DELAYED works for all engines on a per query base.

Further reading

Useful links regarding optimisation of MySQL servers:

Various newsgroups and the MySQL mailing lists
A guide to mysqlreport
The book High Performance MySQL
Tuning tips from the company EZ

Text is available under the terms of the GNU Free Documentation License. Source: Wikibooks

Author: Wikibooks

mysql technology programming optimization tips