mysql复合索引和最左前缀

表结构

id user_id post_id car_id
int int int int

表数据2712617

无索引

select * from `most_left_index` where `user_id` = 1085375 and `post_id` = 399754 and `car_id` = 1351155 limit 1;

查询分析

这时候我们看到了是全表扫描,且都没用用到索引

QueryExecuted {#171 ▼
  +sql: "select * from `most_left_index` where `user_id` = ? and `post_id` = ? and `car_id` = ? limit 1"
  +bindings: array:3 [▶]
  +time: 757.07
  +connection: MySqlConnection {#172 ▶}
  +connectionName: "mysql"
}
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table           | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | most_left_index | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 2712617 |     0.10 | Using where |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+

建立单列索引

给user_id建立一个普通索引

ALTER TABLE `studymysql`.`most_left_index` 
ADD INDEX `idx_user_id` (`user_id` ASC);
QueryExecuted {#171 ▼
  +sql: "select * from `most_left_index` where `user_id` = ? and `post_id` = ? and `car_id` = ? limit 1"
  +bindings: array:3 [▶]
  +time: 6.45
  +connection: MySqlConnection {#172 ▶}
  +connectionName: "mysql"
}
mysql> explain select * from `most_left_index` where `user_id` = 1085375 and `post_id` = 399754 and `car_id` = 1351155 limit 1;
+----+-------------+-----------------+------------+------+---------------+-------------+---------+-------+------+----------+-------------+
| id | select_type | table           | partitions | type | possible_keys | key         | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-----------------+------------+------+---------------+-------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | most_left_index | NULL       | ref  | idx_user_id   | idx_user_id | 4       | const |    1 |     5.00 | Using where |
+----+-------------+-----------------+------------+------+---------------+-------------+---------+-------+------+----------+-------------+

在mysql中执行查询时,只能使用一个索引,如果我们在user_id,post_id,car_id上分别建索引,执行查询时,只能使用一个索引,mysql会选择一个最严格(获得结果集记录数最少)的索引。

执行顺序

  • user_id列建索引,这样就把范围限制在user_id=1085375的结果集1上,
  • 之后扫描结果集1,产生满足post_id=399754的结果集2
  • 再扫描结果集2,找到car_id=1351155的结果集3,即最终结果。

由于建立了user_id列的索引,与执行表的完全扫描相比,效率提高了很多,但我们要求扫描的记录数量仍旧远远超过了实际所需 要的。虽然我们可以删除user_id列上的索引,再创建post_id或者car_id列的索引,但是,不论在哪个列上创建索引搜索效率仍旧相似。

建立复合索引

为了提高搜索效率,我们需要考虑运用多列索引,由于索引文件以B-Tree格式保存,所以我们不用扫描任何记录,即可得到最终结果。

file

ALTER TABLE `studymysql`.`most_left_index` 
DROP INDEX `idx_user_id` ,
ADD INDEX `idx_user_id_post_id_car_id` (`user_id` ASC, `post_id` ASC, `car_id` ASC);
QueryExecuted {#171 ▼
  +sql: "select * from `most_left_index` where `user_id` = ? and `post_id` = ? and `car_id` = ? limit 1"
  +bindings: array:3 [▶]
  +time: 6.01
  +connection: MySqlConnection {#172 ▶}
  +connectionName: "mysql"
}
mysql> explain select * from `most_left_index` where `user_id` = 1085375 and `post_id` = 399754 and `car_id` = 1351155 limit 1;
+----+-------------+-----------------+------------+------+----------------------------+----------------------------+---------+-------------------+------+----------+-------+
| id | select_type | table           | partitions | type | possible_keys              | key                        | key_len | ref               | rows | filtered | Extra |
+----+-------------+-----------------+------------+------+----------------------------+----------------------------+---------+-------------------+------+----------+-------+
|  1 | SIMPLE      | most_left_index | NULL       | ref  | idx_user_id_post_id_car_id | idx_user_id_post_id_car_id | 12      | const,const,const |    1 |   100.00 | NULL  |
+----+-------------+-----------------+------------+------+----------------------------+----------------------------+---------+-------------------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

最左前缀

最左前缀:顾名思义,就是最左优先,上例中我们创建了idx_user_id_post_id_car_id多列索引,相当于创建了

  • (user_id)单列索引,
  • (user_id,post_id)组合索引
  • (user_id,car_id)组合索引
  • (user_id,post_id,car_id)组合索引。

注:在创建多列索引时,要根据业务需求,where子句中使用最频繁的一列放在最左边。

where字句中不带user_id的执行效率

QueryExecuted {#171 ▼
  +sql: "select * from `most_left_index` where `post_id` = ? and `car_id` = ? limit 1"
  +bindings: array:2 [▶]
  +time: 1456.93
  +connection: MySqlConnection {#172 ▶}
  +connectionName: "mysql"
}
mysql> explain select * from `most_left_index` where `post_id` = 399754 and `car_id` = 1351155 limit 1;
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table           | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | most_left_index | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 2712617 |     1.00 | Using where |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

这个时候明显全表扫描了,因为复合索引,并没有post_idcar_id的索引

复合索引顺便改变

file

ALTER TABLE `studymysql`.`most_left_index` 
ADD INDEX `idx_post_id_user_id_car_id` (`post_id` ASC, `user_id` ASC, `car_id` ASC);
QueryExecuted {#171 ▼
  +sql: "select * from `most_left_index` where `post_id` = ? and `car_id` = ? limit 1"
  +bindings: array:2 [▶]
  +time: 8.25
  +connection: MySqlConnection {#172 ▶}
  +connectionName: "mysql"
}
mysql> explain select * from `most_left_index` where `post_id` = 399754 and `car_id` = 1351155 limit 1;
+----+-------------+-----------------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-----------------------+
| id | select_type | table           | partitions | type | possible_keys              | key                        | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+-----------------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | most_left_index | NULL       | ref  | idx_post_id_user_id_car_id | idx_post_id_user_id_car_id | 4       | const |    2 |    10.00 | Using index condition |
+----+-------------+-----------------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-----------------------+

这时候我们改变了复合索引的顺序,post_idcar_id就用到了索引

总结

建立复合索引,一定要把最常用的,用的最多的那个字段作为索引的第一列

参考文献

mysql多列索引和最左前缀