Detecting unutilized indexes in MySQL

Samuel N Wekesa
3 min readSep 12, 2023

--

MySQL doesn’t offer entirely dependable methods for obtaining precise details on index usage. Some available methods are listed below, but note that they might be specific to certain versions or necessitate additional plugins for data collection. One such method involves the Performance Schema, which fetches information from ‘performance_schema’.’table_io_waits_summary_by_index_usage’. For this to work, the performance schema must be activated in our production setting and is compatible with both Percona and the standard MySQL. However, this technique might not be entirely reliable since it considers an index unused if it hasn’t led to an IO wait since the last system restart.

Performance schema:

This approach works by fetching the details from the ‘performance_schema’.’table_io_waits_summary_by_index_usage’. To do this we need to have performance schema enabled in our production environment and is supported in both Percona and stock MySQL. This approach is less safe as because it assumes an index is unused if it has never caused IO wait since last restart.

Note: Data collected using this approach is reliable only if the MySQL service has been up and running for long time so all the queries used by the application on regular basis and jobs that execute daily, nightly, weekly and monthly are executed at least once. Before proceeding with this, make sure to check how long MySQL service has been up.

SELECT object_schema, object_name, 
index_name FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NOT NULL AND
index_name != 'PRIMARY' AND count_star = 0
AND object_schema NOT IN ('mysql', 'performance_schema')
ORDER BY object_schema, object_name;

Sys Schema:
This view reads the data from ‘performance_schema.table_io_waits_summary_by_index_usage’. Sys schema is just a way to read the information stored in performance schema tables in more clear and easy way. Since this reads the data from performance schema, it identifies index usage in the same manner as performance schema so it is reliability is same as previous approach.

SELECT * FROM sys.schema_unused_indexes;

Information Schema and innodb_index_stats:
This approach calculates index usage by comparing ‘mysql’.’ innodb_index_stats’ and ‘INFORMATION_SCHEMA’.’INDEX_STATISTICS’ tables. The prerequisite is to have the ‘userstat’ variable turned ON and compare the output of the two.
One of the limitations on User Statistics is, it can only track direct lookups on index keys but not lookups done during constraint checks so foreign keys will be marked as unused indexes.
The next limitation is this approach don’t provide index stats on partitioned tables.

Note: If you wish to use this approach, you must set userstat=on and let the database run for good amount of time so it can collect the details.

 
SET GLOBAL userstat=on;
SELECT DISTINCT s.TABLE_SCHEMA, s.TABLE_NAME, s.INDEX_NAME
FROM information_schema.statistics `s` LEFT JOIN information_schema.index_statistics INDXS
ON (s.TABLE_SCHEMA = INDXS.TABLE_SCHEMA AND
s.TABLE_NAME=INDXS.TABLE_NAME AND
s.INDEX_NAME=INDXS.INDEX_NAME)
WHERE INDXS.TABLE_SCHEMA IS NULL
AND lower(s.TABLE_SCHEMA) like '%schema%name%'
ORDER BY 1,2,3;

Alternatively we can do the following:

SELECT * FROM INFORMATION_SCHEMA.INDEX_STATISTICS
WHERE TABLE_NAME='table_name';
SELECT * FROM mysql.INNODB_INDEX_STATS where table_name = 'table_name';

Percona specific:

The pt-index-usage tool from Percona connects to a MySQL database, reads through slow query log, and uses EXPLAIN to ask optimizer how it will execute each query. When it is finished, it prints out a report on indexes that the queries didn’t use. The tool works only with slow query log but we can feed other logs by converting the logs to the format of that of slow query log by using pt-query-digest tool. After it reads all the events in the log, the tool prints out DROP statements for every index that was not used. It skips indexes for tables that were never accessed by any queries in the log, to avoid false-positive results.
If you don’t specify –quiet, the tool also outputs warnings about statements that cannot be EXPLAINed and similar. These go to standard error. Progress reports are enabled by default. These also go to standard error.

Note: This command should be executed from the linux command line and not on MySQL shell.

pt-index-usage /path/to/slow.log

‘performance_schema’.’table_io_waits_summary_by_index_usage’. For this to work, the performance schema must be activated in our production setting and is compatible with both Percona and the standard MySQL. However, this technique might not be entirely reliable since it considers an index unused if it hasn’t led to an IO wait since the last system restart.

--

--

Samuel N Wekesa

Data Analysis|| Information Technology|| Business Statistics