MySQL监控与性能指标

MySQL监控与性能指标

MySQL 是业务系统中最常见的数据库,但用好它并不容易。索引设计、SQL 优化、事务处理都是日常开发中需要关注的点。本文从实际场景出发,讲常见问题和解决思路。

一、监控维度

#

1.1 四个黄金指标

维度 指标 说明
吞吐量 QPS/TPS 每秒查询/事务数
延迟 查询响应时间 SQL执行耗时
错误 慢查询/错误率 性能问题和异常
饱和度 连接数/CPU/IO 资源使用程度

#

1.2 监控层次

┌─────────────────────────────────────┐
│ 业务层:慢查询、错误率、响应时间 │
├─────────────────────────────────────┤
│ 实例层:QPS、连接数、复制状态 │
├─────────────────────────────────────┤
│ 存储层:磁盘IO、Buffer Pool命中率 │
├─────────────────────────────────────┤
│ 系统层:CPU、内存、网络、磁盘 │
└─────────────────────────────────────┘

二、核心性能指标

#

2.1 吞吐量指标

-- QPS(每秒查询数)
SHOW GLOBAL STATUS LIKE 'Queries';
-- QPS = (Queries2 - Queries1) / (time2 - time1)

-- TPS(每秒事务数)
SHOW GLOBAL STATUS LIKE 'Com_commit';
SHOW GLOBAL STATUS LIKE 'Com_rollback';
-- TPS = (Com_commit + Com_rollback) / 时间差

-- 各类型SQL执行次数
SHOW GLOBAL STATUS LIKE 'Com_%';

#

2.2 连接指标

-- 当前连接数
SHOW GLOBAL STATUS LIKE 'Threads_connected';

-- 活跃连接数
SHOW GLOBAL STATUS LIKE 'Threads_running';

-- 历史最大连接数
SHOW GLOBAL STATUS LIKE 'Max_used_connections';

-- 连接创建数
SHOW GLOBAL STATUS LIKE 'Connections';

-- 连接缓存命中率
-- Threads_created / Connections 应接近0

#

2.3 查询缓存指标(MySQL 8.0已移除)

-- 仅适用于MySQL 5.7及之前
SHOW GLOBAL STATUS LIKE 'Qcache%';

#

2.4 InnoDB指标

-- Buffer Pool命中率
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read_requests';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_reads';
-- 命中率 = 1 - reads / read_requests

-- 行操作统计
SHOW GLOBAL STATUS LIKE 'Innodb_rows_%';

-- 锁等待
SHOW GLOBAL STATUS LIKE 'Innodb_row_lock_waits';
SHOW GLOBAL STATUS LIKE 'Innodb_row_lock_time_avg';

-- Buffer Pool页状态
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_%';

-- 日志写入
SHOW GLOBAL STATUS LIKE 'Innodb_log_waits';
SHOW GLOBAL STATUS LIKE 'Innodb_os_log_fsyncs';

#

2.5 复制指标

-- 从库延迟
SHOW SLAVE STATUS \G;
-- Seconds_Behind_Master

-- IO线程状态
-- Slave_IO_Running: Yes
-- Slave_SQL_Running: Yes

-- 主库binlog位置
SHOW MASTER STATUS;

#

2.6 慢查询指标

-- 慢查询次数
SHOW GLOBAL STATUS LIKE 'Slow_queries';

-- 全表扫描次数
SHOW GLOBAL STATUS LIKE 'Select_scan';
SHOW GLOBAL STATUS LIKE 'Select_full_join';

-- 创建临时表次数
SHOW GLOBAL STATUS LIKE 'Created_tmp_%';

三、Performance Schema监控

#

3.1 开启配置

[mysqld]
performance_schema = ON

#

3.2 常用查询

-- 查看最耗时的SQL
SELECT
DIGEST_TEXT AS query,
COUNT_STAR AS exec_count,
AVG_TIMER_WAIT / 1000000000000 AS avg_time_sec,
MAX_TIMER_WAIT / 1000000000000 AS max_time_sec,
SUM_ROWS_SENT AS total_rows_sent,
SUM_ROWS_EXAMINED AS total_rows_examined
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;

-- 查看表IO统计
SELECT
OBJECT_SCHEMA,
OBJECT_NAME,
COUNT_READ,
COUNT_WRITE,
SUM_NUMBER_OF_BYTES_READ,
SUM_NUMBER_OF_BYTES_WRITE
FROM performance_schema.table_io_waits_summary_by_table
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;

-- 查看锁等待
SELECT
OBJECT_SCHEMA,
OBJECT_NAME,
COUNT_READ,
COUNT_WRITE,
SUM_TIMER_WAIT
FROM performance_schema.table_lock_waits_summary_by_table
ORDER BY SUM_TIMER_WAIT DESC;

-- 查看文件IO
SELECT
EVENT_NAME,
COUNT_READ,
COUNT_WRITE,
SUM_NUMBER_OF_BYTES_READ,
SUM_NUMBER_OF_BYTES_WRITE
FROM performance_schema.file_summary_by_event_name
WHERE EVENT_NAME LIKE '%innodb%'
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;

#

3.3 线程监控

-- 查看当前线程
SELECT
THREAD_ID,
NAME,
PROCESSLIST_ID,
PROCESSLIST_USER,
PROCESSLIST_HOST,
PROCESSLIST_DB,
PROCESSLIST_COMMAND,
PROCESSLIST_STATE,
PROCESSLIST_TIME,
PROCESSLIST_INFO
FROM performance_schema.threads
WHERE TYPE = 'FOREGROUND'
ORDER BY PROCESSLIST_TIME DESC;

四、sys系统库

MySQL 5.7+ 提供的便捷视图:

-- 查看最慢的SQL(格式化后)
SELECT * FROM sys.statements_with_full_table_scans;

-- 查看执行时间最长的SQL
SELECT * FROM sys.statements_with_runtimes_in_95th_percentile;

-- 查看IO最多的表
SELECT * FROM sys.io_global_by_file_by_bytes;

-- 查看使用临时表最多的SQL
SELECT * FROM sys.statements_with_temp_tables;

-- 查看未使用索引的SQL
SELECT * FROM sys.statements_with_full_table_scans;

-- 查看InnoDB锁等待
SELECT * FROM sys.innodb_lock_waits;

-- 查看主机摘要
SELECT * FROM sys.host_summary;

五、Prometheus + Grafana监控

#

5.1 mysqld_exporter

# 下载安装
wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.15.0/mysqld_exporter-0.15.0.linux-amd64.tar.gz
tar xzf mysqld_exporter-*.tar.gz

# 创建MySQL监控账号
CREATE USER 'exporter'@'localhost' IDENTIFIED BY 'password';
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';

# 配置
cat > ~/.my.cnf <<EOF
[client]
user=exporter
password=password
EOF

# 启动
./mysqld_exporter --config.my-cnf=~/.my.cnf

#

5.2 Prometheus配置

# prometheus.yml
scrape_configs:
- job_name: 'mysql'
static_configs:
- targets: ['localhost:9104']
labels:
instance: 'mysql-prod-01'

#

5.3 Grafana仪表盘

导入MySQL监控Dashboard(ID: 7362 或 11323):

核心面板:

  • MySQL Uptime
  • Current QPS
  • Connections
  • Active Threads
  • Slow Queries
  • Replication Lag
  • Buffer Pool Hit Ratio

#

5.4 告警规则

# mysql_alerts.yml
groups:
- name: mysql
rules:
- alert: MySQLDown
expr: mysql_up == 0
for: 1m
labels:
severity: critical
annotations:
summary: "MySQL实例宕机"

- alert: MySQLHighConnections
expr: mysql_global_status_threads_connected / mysql_global_variables_max_connections > 0.8
for: 5m
labels:
severity: warning
annotations:
summary: "MySQL连接数超过80%"

- alert: MySQLSlowQueries
expr: rate(mysql_global_status_slow_queries[5m]) > 1
for: 5m
labels:
severity: warning
annotations:
summary: "MySQL慢查询增长"

- alert: MySQLReplicationLag
expr: mysql_slave_lag_seconds > 10
for: 5m
labels:
severity: warning
annotations:
summary: "MySQL复制延迟超过10秒"

- alert: MySQLBufferPoolHitRatio
expr: rate(mysql_global_status_innodb_buffer_pool_reads[5m]) / rate(mysql_global_status_innodb_buffer_pool_read_requests[5m]) > 0.01
for: 10m
labels:
severity: warning
annotations:
summary: "Buffer Pool命中率低于99%"

六、慢查询监控

#

6.1 实时慢查询分析

-- 查看当前执行时间超过10秒的查询
SELECT
THREAD_ID,
PROCESSLIST_ID,
PROCESSLIST_USER,
PROCESSLIST_HOST,
PROCESSLIST_DB,
PROCESSLIST_TIME,
PROCESSLIST_INFO
FROM performance_schema.threads
WHERE PROCESSLIST_COMMAND = 'Query'
AND PROCESSLIST_TIME > 10
ORDER BY PROCESSLIST_TIME DESC;

#

6.2 pt-kill自动杀慢查询

# 杀掉执行超过60秒的SELECT
pt-kill --host=localhost --user=root --password=password \
--match-command='Query' \
--match-state='Sending data|Copying to tmp table' \
--busy-time=60 \
--victims=all \
--interval=10 \
--kill \
--print

七、容量规划

#

7.1 数据增长趋势

-- 查看数据库大小
SELECT
table_schema,
ROUND(SUM(data_length + index_length) / 1024 / 1024 / 1024, 2) AS size_gb
FROM information_schema.tables
GROUP BY table_schema
ORDER BY size_gb DESC;

-- 查看表大小
SELECT
table_name,
ROUND(data_length / 1024 / 1024, 2) AS data_mb,
ROUND(index_length / 1024 / 1024, 2) AS index_mb,
table_rows
FROM information_schema.tables
WHERE table_schema = 'mydb'
ORDER BY (data_length + index_length) DESC
LIMIT 20;

#

7.2 QPS增长趋势

使用Prometheus查询:

# QPS趋势
rate(mysql_global_status_queries[1m])

# 连接数趋势
mysql_global_status_threads_connected

# 磁盘使用趋势
mysql_global_status_innodb_data_written

八、监控检查清单

#

8.1 每日检查

  • 复制延迟是否正常
  • 慢查询数量是否增长
  • 连接数是否在正常范围
  • 是否有死锁发生

#

8.2 每周检查

  • Buffer Pool命中率
  • 表空间增长情况
  • 索引使用情况
  • 备份状态

#

8.3 每月检查

  • 容量规划(数据增长趋势)
  • 全量SQL审核
  • 性能基准对比
  • 监控告警规则 review

九、总结

指标类别 关键指标 告警阈值
可用性 mysql_up = 0
连接 Threads_connected/max_connections > 80%
复制 Seconds_Behind_Master > 10秒
查询 Slow_queries增长率 突增
缓冲 Buffer Pool命中率 < 95%
Innodb_row_lock_waits 持续增加
磁盘 磁盘使用率 > 80%

监控的核心价值:

  1. 及时发现问题,缩短故障时间
  2. 掌握性能趋势,提前容量规划
  3. 量化优化效果,指导调优方向
  4. 提供数据支撑,辅助决策判断

核心要点

  1. 索引设计的原则:最左前缀原则、避免索引列参与计算、选择合适的索引类型

  2. SQL 优化的方法:使用 EXPLAIN 分析执行计划、避免 SELECT *、优化 JOIN

  3. 事务隔离级别的选择:根据业务需求选择合适的级别

  4. 常见的锁问题:行锁、表锁、死锁的处理

总结

MySQL 优化是一个持续的过程,需要结合业务特点和数据量来调整。在实际项目中,定期分析慢查询日志、优化索引、调整配置参数,都能提升数据库性能。


   转载规则


《MySQL监控与性能指标》 小乐 采用 知识共享署名 4.0 国际许可协议 进行许可。
  目录