从补丁 198 开始，Amazon Redshift 将不再支持创建新的 Python UDF。现有的 Python UDF 将继续正常运行至 2026 年 6 月 30 日。有关更多信息，请参阅[博客文章](https://aws.amazon.com/blogs/big-data/amazon-redshift-python-user-defined-functions-will-reach-end-of-support-after-june-30-2026/)。

# 聚合扩展
<a name="r_GROUP_BY_aggregation-extensions"></a>

Amazon Redshift 支持聚合扩展，以便在单个语句中完成多个 GROUP BY 操作。

 聚合扩展的示例使用 `orders` 表，该表包含电子公司的销售数据。您可以使用以下项创建 `orders`。

```
CREATE TABLE ORDERS (
    ID INT,
    PRODUCT CHAR(20),
    CATEGORY CHAR(20),
    PRE_OWNED CHAR(1),
    COST DECIMAL
);

INSERT INTO ORDERS VALUES
    (0, 'laptop',       'computers',    'T', 1000),
    (1, 'smartphone',   'cellphones',   'T', 800),
    (2, 'smartphone',   'cellphones',   'T', 810),
    (3, 'laptop',       'computers',    'F', 1050),
    (4, 'mouse',        'computers',    'F', 50);
```

## *GROUPING SETS*
<a name="r_GROUP_BY_aggregation-extensions-grouping-sets"></a>

 在单个语句中计算一个或多个分组集。分组集是单个 GROUP BY 子句的集合，这是一组 0 列或更多列，您可以通过这些列对查询的结果集进行分组。GROUP BY GROUPING SETS 等效于对一个按不同列分组的结果集运行 UNION ALL 查询。例如，GROUP BY GROUPING SETS((a), (b)) 等效于 GROUP BY a UNION ALL GROUP BY b。

 以下示例返回按产品类别和所售产品类型分组的订单表产品的成本。

```
SELECT category, product, sum(cost) as total
FROM orders
GROUP BY GROUPING SETS(category, product);

       category       |       product        | total
----------------------+----------------------+-------
 computers            |                      |  2100
 cellphones           |                      |  1610
                      | laptop               |  2050
                      | smartphone           |  1610
                      | mouse                |    50

(5 rows)
```

## *ROLLUP*
<a name="r_GROUP_BY_aggregation-extensions-rollup"></a>

 假设在一个层次结构中，前面的列被视为后续列的父列。ROLLUP 按提供的列对数据进行分组，除了分组行之外，还返回额外的小计行，表示所有分组列级别的总计。例如，您可以使用 GROUP BY ROLLUP((a), (b)) 返回先按 a 分组的结果集，然后在假设 b 是 a 的一个子部分的情况下按 b 分组。ROLLUP 还会返回包含整个结果集而不包含分组列的行。

GROUP BY ROLLUP((a), (b)) 等效于 GROUP BY GROUPING SETS((a,b), (a), ())。

以下示例返回先按类别分组，然后按产品分组，且产品是类别细分项的订单表产品的成本。

```
SELECT category, product, sum(cost) as total
FROM orders
GROUP BY ROLLUP(category, product) ORDER BY 1,2;

       category       |       product        | total
----------------------+----------------------+-------
 cellphones           | smartphone           |  1610
 cellphones           |                      |  1610
 computers            | laptop               |  2050
 computers            | mouse                |    50
 computers            |                      |  2100
                      |                      |  3710
(6 rows)
```

## *CUBE*
<a name="r_GROUP_BY_aggregation-extensions-cube"></a>

 按提供的列对数据进行分组，除了分组行之外，还返回额外的小计行，表示所有分组列级别的总计。CUBE 返回与 ROLLUP 相同的行，同时为 ROLLUP 未涵盖的每个分组列组合添加额外的小计行。例如，您可以使用 GROUP BY CUBE ((a), (b)) 返回先按 a 分组的结果集，然后在假设 b 是 a 的一个子部分的情况下按 b 分组，再然后是单独按 b 分组的结果集。CUBE 还会返回包含整个结果集而不包含分组列的行。

GROUP BY CUBE((a), (b)) 等效于 GROUP BY GROUPING SETS((a, b), (a), (b), ())。

以下示例返回先按类别分组，然后按产品分组，且产品是类别细分项的订单表产品的成本。与前面的 ROLLUP 示例不同，该语句返回每个分组列组合的结果。

```
SELECT category, product, sum(cost) as total
FROM orders
GROUP BY CUBE(category, product) ORDER BY 1,2;

       category       |       product        | total
----------------------+----------------------+-------
 cellphones           | smartphone           |  1610
 cellphones           |                      |  1610
 computers            | laptop               |  2050
 computers            | mouse                |    50
 computers            |                      |  2100
                      | laptop               |  2050
                      | mouse                |    50
                      | smartphone           |  1610
                      |                      |  3710
(9 rows)
```

## *GROUPING/GROUPING\$1ID 函数*
<a name="r_GROUP_BY_aggregation-extentions-grouping"></a>

 ROLLUP 和 CUBE 为结果集添加 NULL 值以表示小计行。例如，GROUP BY ROLLUP((a), (b)) 返回 b 分组列中值为 NULL 的一行或多行，以表明它们是 a 分组列中字段的小计。这些 NULL 值仅用于满足返回元组的格式。

 当您使用 ROLLUP 和 CUBE 对本身存储 NULL 值的关系运行 GROUP BY 操作时，这样生成的结果集会包含看起来具有相同分组列的行。返回前面的示例，如果 b 分组列包含存储的 NULL 值，则 GROUP BY ROLLUP((a), (b)) 返回 b 分组列中值为 NULL 且不是小计的行。

 要区分由 ROLLUP 和 CUBE 创建的 NULL 值和在表本身存储的 NULL 值，可以使用 GROUPING 函数或其别名 GROUPING\$1ID。GROUPING 采用单个分组集作为其参数，并且对于结果集中的每一行，返回与该位置的分组列对应的 0 或 1 位值，然后将该值转换为整数。如果该位置的值是由聚合扩展创建的 NULL 值，则 GROUPING 返回 1。对于所有其他值（包括存储的 NULL 值），该示例返回 0。

 例如，GROUPING(category, product) 为给定行返回以下值，具体取决于该行的分组列值。就本示例而言，表中的所有 NULL 值都是由聚合扩展创建的 NULL 值。

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_cn/redshift/latest/dg/r_GROUP_BY_aggregation-extensions.html)

GROUPING 函数按照以下格式显示在查询的 SELECT 列表部分。

```
SELECT ... [GROUPING( expr )...] ...
  GROUP BY ... {CUBE | ROLLUP| GROUPING SETS} ( expr ) ...
```

以下示例与前面的 CUBE 示例相同，但为其分组集添加了 GROUPING 函数。

```
SELECT category, product,
       GROUPING(category) as grouping0,
       GROUPING(product) as grouping1,
       GROUPING(category, product) as grouping2,
       sum(cost) as total
FROM orders
GROUP BY CUBE(category, product) ORDER BY 3,1,2;

       category       |       product        | grouping0 | grouping1 | grouping2 | total
----------------------+----------------------+-----------+-----------+-----------+-------
 cellphones           | smartphone           |         0 |         0 |         0 |  1610
 cellphones           |                      |         0 |         1 |         1 |  1610
 computers            | laptop               |         0 |         0 |         0 |  2050
 computers            | mouse                |         0 |         0 |         0 |    50
 computers            |                      |         0 |         1 |         1 |  2100
                      | laptop               |         1 |         0 |         2 |  2050
                      | mouse                |         1 |         0 |         2 |    50
                      | smartphone           |         1 |         0 |         2 |  1610
                      |                      |         1 |         1 |         3 |  3710
(9 rows)
```

## *部分 ROLLUP 和 CUBE*
<a name="r_GROUP_BY_aggregation-extentions-partial"></a>

 您可以只使用小计的一部分来运行 ROLLUP 和 CUBE 操作。

 部分 ROLLUP 和 CUBE 操作的语法如下所示。

```
GROUP BY expr1, { ROLLUP | CUBE }(expr2, [, ...])
```

在这里，GROUP BY 子句仅在 *expr2* 及更高级别创建小计行。

以下示例显示了对订单表执行的部分 ROLLUP 和 CUBE 操作，首先按产品是否为二手产品进行分组，然后对 category 和 product 列运行 ROLLUP 和 CUBE。

```
SELECT pre_owned, category, product,
       GROUPING(category, product, pre_owned) as group_id,
       sum(cost) as total
FROM orders
GROUP BY pre_owned, ROLLUP(category, product) ORDER BY 4,1,2,3;

 pre_owned |       category       |       product        | group_id | total
-----------+----------------------+----------------------+----------+-------
 F         | computers            | laptop               |        0 |  1050
 F         | computers            | mouse                |        0 |    50
 T         | cellphones           | smartphone           |        0 |  1610
 T         | computers            | laptop               |        0 |  1000
 F         | computers            |                      |        2 |  1100
 T         | cellphones           |                      |        2 |  1610
 T         | computers            |                      |        2 |  1000
 F         |                      |                      |        6 |  1100
 T         |                      |                      |        6 |  2610
(9 rows)

SELECT pre_owned, category, product,
       GROUPING(category, product, pre_owned) as group_id,
       sum(cost) as total
FROM orders
GROUP BY pre_owned, CUBE(category, product) ORDER BY 4,1,2,3;

 pre_owned |       category       |       product        | group_id | total
-----------+----------------------+----------------------+----------+-------
 F         | computers            | laptop               |        0 |  1050
 F         | computers            | mouse                |        0 |    50
 T         | cellphones           | smartphone           |        0 |  1610
 T         | computers            | laptop               |        0 |  1000
 F         | computers            |                      |        2 |  1100
 T         | cellphones           |                      |        2 |  1610
 T         | computers            |                      |        2 |  1000
 F         |                      | laptop               |        4 |  1050
 F         |                      | mouse                |        4 |    50
 T         |                      | laptop               |        4 |  1000
 T         |                      | smartphone           |        4 |  1610
 F         |                      |                      |        6 |  1100
 T         |                      |                      |        6 |  2610
(13 rows)
```

由于 ROLLUP 和 CUBE 操作中不包括 pre-owned 列，因此不存在包括所有其他行的总计行。

## *连接分组*
<a name="r_GROUP_BY_aggregation-extentions-concat"></a>

 您可以连接多个 GROUPING SETS/ROLLUP/CUBE 子句来计算不同级别的小计。连接分组返回所提供分组集的笛卡尔乘积。

 连接 GROUPING SETS/ROLLUP/CUBE 子句的语法如下所示。

```
GROUP BY {ROLLUP|CUBE|GROUPING SETS}(expr1[, ...]),
         {ROLLUP|CUBE|GROUPING SETS}(expr1[, ...])[, ...]
```

考虑以下示例，看看小型连接分组如何生成大的最终结果集。

```
SELECT pre_owned, category, product,
       GROUPING(category, product, pre_owned) as group_id,
       sum(cost) as total
FROM orders
GROUP BY CUBE(category, product), GROUPING SETS(pre_owned, ())
ORDER BY 4,1,2,3;

 pre_owned |       category       |       product        | group_id | total
-----------+----------------------+----------------------+----------+-------
 F         | computers            | laptop               |        0 |  1050
 F         | computers            | mouse                |        0 |    50
 T         | cellphones           | smartphone           |        0 |  1610
 T         | computers            | laptop               |        0 |  1000
           | cellphones           | smartphone           |        1 |  1610
           | computers            | laptop               |        1 |  2050
           | computers            | mouse                |        1 |    50
 F         | computers            |                      |        2 |  1100
 T         | cellphones           |                      |        2 |  1610
 T         | computers            |                      |        2 |  1000
           | cellphones           |                      |        3 |  1610
           | computers            |                      |        3 |  2100
 F         |                      | laptop               |        4 |  1050
 F         |                      | mouse                |        4 |    50
 T         |                      | laptop               |        4 |  1000
 T         |                      | smartphone           |        4 |  1610
           |                      | laptop               |        5 |  2050
           |                      | mouse                |        5 |    50
           |                      | smartphone           |        5 |  1610
 F         |                      |                      |        6 |  1100
 T         |                      |                      |        6 |  2610
           |                      |                      |        7 |  3710
(22 rows)
```

## *嵌套分组*
<a name="r_GROUP_BY_aggregation-extentions-nested"></a>

 您可以使用 GROUPING SETS/ROLLUP/CUBE 操作作为 GROUPING SETS *expr* 来形成嵌套分组。展平嵌套 GROUPING SETS 内的子分组。

 嵌套分组的语法如下所示。

```
GROUP BY GROUPING SETS({ROLLUP|CUBE|GROUPING SETS}(expr[, ...])[, ...])
```

考虑以下示例。

```
SELECT category, product, pre_owned,
       GROUPING(category, product, pre_owned) as group_id,
       sum(cost) as total
FROM orders
GROUP BY GROUPING SETS(ROLLUP(category), CUBE(product, pre_owned))
ORDER BY 4,1,2,3;

       category       |       product        | pre_owned | group_id | total
----------------------+----------------------+-----------+----------+-------
 cellphones           |                      |           |        3 |  1610
 computers            |                      |           |        3 |  2100
                      | laptop               | F         |        4 |  1050
                      | laptop               | T         |        4 |  1000
                      | mouse                | F         |        4 |    50
                      | smartphone           | T         |        4 |  1610
                      | laptop               |           |        5 |  2050
                      | mouse                |           |        5 |    50
                      | smartphone           |           |        5 |  1610
                      |                      | F         |        6 |  1100
                      |                      | T         |        6 |  2610
                      |                      |           |        7 |  3710
                      |                      |           |        7 |  3710
(13 rows)
```

请注意，由于 ROLLUP(category) 和 CUBE(product, pre\$1owned) 都包含分组集 ()，因此表示总计的行是重复的。

## *使用说明*：
<a name="r_GROUP_BY_aggregation-extensions-usage-notes"></a>
+ GROUP BY 子句最多支持 64 个分组集。对于 ROLLUP 和 CUBE，或者 GROUPING SETS、ROLLUP 和 CUBE 的某种组合，此限制适用于隐含的分组集数。例如，GROUP BY CUBE((a), (b)) 计为 4 个分组集，而不是 2 个。
+ 使用聚合扩展时，不能使用常量作为分组列。
+ 无法创建包含重复列的分组集。