SQL HAVING 子句使用详解

HAVING 是 SQL 中用于过滤分组结果的关键字,它与 WHERE 类似但作用于聚合后的数据。以下是全面指南:

基本语法

1
2
3
4
SELECT1, 聚合函数(列2)
FROM 表名
GROUP BY1
HAVING 聚合函数(列2) 条件;

与 WHERE 的区别

特性 WHERE HAVING
执行时机 在分组前过滤原始数据 在分组后过滤聚合结果
可用字段 原始列 聚合列或GROUP BY列
性能 更高效(减少处理量) 相对较低

使用场景

1. 过滤聚合结果

1
2
3
4
5
-- 查找平均分数大于80的班级
SELECT class, AVG(score) as avg_score
FROM students
GROUP BY class
HAVING AVG(score) > 80;

2. 结合多个聚合条件

1
2
3
4
5
6
7
8
-- 找出总销售额超过1万且订单数大于50的销售员
SELECT salesperson,
SUM(amount) as total_sales,
COUNT(*) as order_count
FROM orders
GROUP BY salesperson
HAVING SUM(amount) > 10000
AND COUNT(*) > 50;

3. 使用别名过滤

1
2
3
4
5
-- 使用列别名(注意:并非所有数据库都支持)
SELECT department, AVG(salary) as avg_salary
FROM employees
GROUP BY department
HAVING avg_salary > 5000;

实际案例

案例1:电商分析

1
2
3
4
5
6
7
8
-- 找出月销售额超过5万的产品类别
SELECT category,
SUM(price * quantity) as monthly_sales
FROM order_details
WHERE order_date BETWEEN '2023-01-01' AND '2023-01-31'
GROUP BY category
HAVING SUM(price * quantity) > 50000
ORDER BY monthly_sales DESC;

案例2:社交网络分析

1
2
3
4
5
6
7
8
9
-- 查询拥有超过1000粉丝且互动率高于5%的用户
SELECT user_id,
COUNT(follower_id) as follower_count,
(SUM(likes)/COUNT(posts)) as engagement_rate
FROM users
LEFT JOIN posts USING(user_id)
GROUP BY user_id
HAVING COUNT(follower_id) > 1000
AND (SUM(likes)/COUNT(posts)) > 0.05;

常见错误

错误1:对非聚合列使用HAVING

1
2
3
4
5
-- 错误:name不是聚合列
SELECT name, AVG(score)
FROM students
GROUP BY class
HAVING name LIKE '张%';-- 应该用WHERE

错误2:混淆WHERE和HAVING的执行顺序

1
2
3
4
5
-- 错误逻辑:想先过滤再分组,但实际执行顺序相反
SELECT department, COUNT(*)
FROM employees
HAVING salary > 5000-- 执行时salary已不可见
GROUP BY department;

正确理解执行顺序:WHERE → GROUP BY → HAVING → ORDER BY → LIMIT


高级用法

1. 嵌套聚合过滤

1
2
3
4
5
6
7
8
9
-- 找出各部门中高于部门平均工资的员工
SELECT e.department, e.name, e.salary
FROM employees e
JOIN (
SELECT department, AVG(salary) as dept_avg
FROM employees
GROUP BY department
) d ON e.department = d.department
WHERE e.salary > d.dept_avg;

2. 与CASE WHEN结合

1
2
3
4
5
6
7
8
-- 分类统计并筛选
SELECT
product_type,
COUNT(CASE WHEN status = 'active' THEN 1 END) as active_count,
COUNT(CASE WHEN status = 'inactive' THEN 1 END) as inactive_count
FROM products
GROUP BY product_type
HAVING COUNT(CASE WHEN status = 'active' THEN 1 END) > 10;

性能优化建议

  1. 优先用WHERE过滤:能先用WHERE过滤的数据不要留到HAVING

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    -- 优化前
    SELECT user_id, COUNT(*)
    FROM logs
    GROUP BY user_id
    HAVING COUNT(*) > 100;

    -- 优化后(假设有活跃用户标记)
    SELECT user_id, COUNT(*)
    FROM logs
    WHERE is_active = 1-- 先过滤活跃用户
    GROUP BY user_id
    HAVING COUNT(*) > 100;
  2. 避免复杂计算:HAVING中的计算尽量简单

  3. 索引利用:确保GROUP BY的列有适当索引