跳到主要内容
版本: 最新版本-3.5

array_sortby

根据另一个数组或从 lambda 表达式转换的数组中的元素升序对数组中的元素进行排序。有关详细信息,请参见Lambda 表达式。此函数从 v2.5 开始支持。

两个数组中的元素就像键值对。例如,b = [7,5,6] 是 a = [3,1,4] 的排序键。根据键值对关系,两个数组中的元素具有以下一对一映射。

Array元素 1元素 2元素 3
a314
b756

数组b按升序排序后,变为 [5,6,7]。数组a相应地变为 [1,4,3]。

Array元素 1元素 2元素 3
a143
b567

当参与排序的数组数量大于两个时,此函数的目的是基于多个数组列(例如 array1、array2、array3 等)中的值对 array0 进行排序。排序规则如下

首先,比较 array1 中对应的元素;如果它们相同,则比较 array2 中对应的元素;依此类推,直到最后一个数组列。

注意

参与排序的数组中的元素必须是可排序类型或 JSON 类型。参与排序的数组的大小必须与原始数组一致(NULL 值除外)。

示例说明

给定以下四个数组

array0 = [1, 2, 3, 4, 5]
array1 = [6, 5, 5, 5, 4]
array2 = ["d", "b", "a", "b", "4"]
array3 = ["2023-01-01", "2023-01-04", "2023-01-03", "2023-01-05", "2023-01-02"]

排序步骤

  1. 比较 array1
  • array1 的排序索引顺序为 [4, 2, 3, 1, 0],因为 4 < 5 < 5 < 5 < 6。
  1. 对于 array1 中的元素 [5, 5, 5],比较 array2
  • array2 中 [a, b, b] 的排序顺序为 [2, 3, 4],因为 a < b = b。
  1. 对于 array1 和 array2 中的元素 [b, b],比较 array3
  • array3 中 [2023-01-04, 2023-01-05] 的排序顺序为 [2, 4],因为 2023-01-04 < 2023-01-05。

array0 的最终排序结果

array0 = [5, 3, 2, 4, 1]

语法

array_sortby(array0, array1)
array_sortby(<lambda function>, array0 [, array1...])
array_sortby(array0, array1, [array2, array3...])
  • array_sortby(array0, array1)

    根据array1的升序对array0进行排序。

  • array_sortby(<lambda 函数>, array0 [, array1...])

    根据 lambda 函数返回的数组对array0进行排序。

  • array_sortby(array0, array1, [array2, array3...])

    基于多个数组列(例如 array1、array2、array3 等)的值按升序对 array0 进行排序。排序规则是:首先,比较 array1 的相应元素;如果它们相同,则比较 array2 的相应元素;依此类推,直到最后一个数组列。

参数

  • array0:要排序的数组。它必须是数组、数组表达式或null。数组中的元素必须是可排序的。
  • array1:用于对array0进行排序的排序数组。它必须是数组、数组表达式或null
  • lambda 函数:用于生成排序数组的 lambda 表达式。
  • array1, [array2, array3...]:用于对array0进行排序的排序数组。它必须是数组、数组表达式或null

返回值

返回一个数组。

使用说明

  • 此函数只能按升序对数组的元素进行排序。
  • NULL值放置在返回的数组的开头。
  • 如果要按降序对数组的元素进行排序,请使用reverse函数。
  • 如果排序数组 (array1) 为 null,则array0中的数据保持不变。
  • 返回的数组的元素与array0的元素具有相同的数据类型。null 值的属性也相同。
  • 所有数组必须具有相同数量的元素。否则,将返回错误。

示例

下表用于演示如何使用此函数。

CREATE TABLE `test_array` (
`c1` int(11) NULL COMMENT "",
`c2` ARRAY<int(11)> NULL COMMENT "",
`c3` ARRAY<int(11)> NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`c1`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`c1`)
PROPERTIES (
"replication_num" = "3",
"storage_format" = "DEFAULT",
"enable_persistent_index" = "true",
"compression" = "LZ4"
);

insert into test_array values
(1,[4,3,5],[82,1,4]),
(2,null,[23]),
(3,[4,2],[6,5]),
(4,null,null),
(5,[],[]),
(6,NULL,[]),
(7,[],null),
(8,[null,null],[3,6]),
(9,[432,21,23],[5,4,null]);

select * from test_array order by c1;
+------+-------------+------------+
| c1 | c2 | c3 |
+------+-------------+------------+
| 1 | [4,3,5] | [82,1,4] |
| 2 | NULL | [23] |
| 3 | [4,2] | [6,5] |
| 4 | NULL | NULL |
| 5 | [] | [] |
| 6 | NULL | [] |
| 7 | [] | NULL |
| 8 | [null,null] | [3,6] |
| 9 | [432,21,23] | [5,4,null] |
+------+-------------+------------+
9 rows in set (0.00 sec)

示例 1:根据c2c3进行排序。此示例还提供了 array_sort() 的结果以进行比较。

select c1, c3, c2, array_sort(c2), array_sortby(c3,c2)
from test_array order by c1;
+------+------------+-------------+----------------+----------------------+
| c1 | c3 | c2 | array_sort(c2) | array_sortby(c3, c2) |
+------+------------+-------------+----------------+----------------------+
| 1 | [82,1,4] | [4,3,5] | [3,4,5] | [1,82,4] |
| 2 | [23] | NULL | NULL | [23] |
| 3 | [6,5] | [4,2] | [2,4] | [5,6] |
| 4 | NULL | NULL | NULL | NULL |
| 5 | [] | [] | [] | [] |
| 6 | [] | NULL | NULL | [] |
| 7 | NULL | [] | [] | NULL |
| 8 | [3,6] | [null,null] | [null,null] | [3,6] |
| 9 | [5,4,null] | [432,21,23] | [21,23,432] | [4,null,5] |
+------+------------+-------------+----------------+----------------------+

示例 2:基于从 lambda 表达式生成的c2对数组c3进行排序。此示例等效于示例 1。它还提供了 array_sort() 的结果以进行比较。

select
c1,
c3,
c2,
array_sort(c2) as sorted_c2_asc,
array_sortby((x,y) -> y, c3, c2) as sorted_c3_by_c2
from test_array order by c1;
+------+------------+-------------+---------------+-----------------+
| c1 | c3 | c2 | sorted_c2_asc | sorted_c3_by_c2 |
+------+------------+-------------+---------------+-----------------+
| 1 | [82,1,4] | [4,3,5] | [3,4,5] | [82,1,4] |
| 2 | [23] | NULL | NULL | [23] |
| 3 | [6,5] | [4,2] | [2,4] | [5,6] |
| 4 | NULL | NULL | NULL | NULL |
| 5 | [] | [] | [] | [] |
| 6 | [] | NULL | NULL | [] |
| 7 | NULL | [] | [] | NULL |
| 8 | [3,6] | [null,null] | [null,null] | [3,6] |
| 9 | [5,4,null] | [432,21,23] | [21,23,432] | [4,null,5] |
+------+------------+-------------+---------------+-----------------+

示例 3:基于c2+c3的升序对数组c3进行排序。

select
c3,
c2,
array_map((x,y)-> x+y,c3,c2) as sum,
array_sort(array_map((x,y)-> x+y, c3, c2)) as sorted_sum,
array_sortby((x,y) -> x+y, c3, c2) as sorted_c3_by_sum
from test_array where c1=1;
+----------+---------+----------+------------+------------------+
| c3 | c2 | sum | sorted_sum | sorted_c3_by_sum |
+----------+---------+----------+------------+------------------+
| [82,1,4] | [4,3,5] | [86,4,9] | [4,9,86] | [1,4,82] |
+----------+---------+----------+------------+------------------+
CREATE TABLE test_array_sortby_muliti (
id INT(11) not null,
array_col1 ARRAY<INT>,
array_col2 ARRAY<DOUBLE>,
array_col3 ARRAY<VARCHAR(20)>,
array_col4 ARRAY<DATE>
) ENGINE=OLAP
DUPLICATE KEY(id)
COMMENT "OLAP"
DISTRIBUTED BY HASH(id)
PROPERTIES (
"replication_num" = "1",
"storage_format" = "DEFAULT",
"enable_persistent_index" = "true",
"compression" = "LZ4"
);

INSERT INTO test_array_sortby_multi VALUES
(1, [4, 3, 5], [1.1, 2.2, 2.2], ['a', 'b', 'c'], ['2023-01-01', '2023-01-02', '2023-01-03']),
(2, [6, 7, 8], [6.6, 5.5, 6.6], ['d', 'e', 'd'], ['2023-01-04', '2023-01-05', '2023-01-06']),
(3, NULL, [7.7, 8.8, 8.8], ['g', 'h', 'h'], ['2023-01-07', '2023-01-08', '2023-01-09']),
(4, [9, 10, 11], NULL, ['k', 'k', 'j'], ['2023-01-10', '2023-01-12', '2023-01-11']),
(5, [12, 13, 14], [10.10, 11.11, 11.11], NULL, ['2023-01-13', '2023-01-14', '2023-01-15']),
(6, [15, 16, 17], [14.14, 13.13, 14.14], ['m', 'o', 'o'], NULL),
(7, [18, 19, 20], [16.16, 16.16, 18.18], ['p', 'p', 'r'], ['2023-01-16', NULL, '2023-01-18']),
(8, [21, 22, 23], [19.19, 20.20, 19.19], ['a', 't', 'a'], ['2023-01-19', '2023-01-20', '2023-01-21']),
(9, [24, 25, 26], NULL, ['y', 'y', 'z'], ['2023-01-25', '2023-01-24', '2023-01-26']),
(10, [24, 25, 26], NULL, ['y', 'y', 'z'], ['2023-01-25', NULL, '2023-01-26']);


select * from test_array_sortby_multi order by id asc;
+------+------------+---------------------+---------------+------------------------------------------+
| id | array_col1 | array_col2 | array_col3 | array_col4 |
+------+------------+---------------------+---------------+------------------------------------------+
| 1 | [4,3,5] | [1.1,2.2,2.2] | ["a","b","c"] | ["2023-01-01","2023-01-02","2023-01-03"] |
| 2 | [6,7,8] | [6.6,5.5,6.6] | ["d","e","d"] | ["2023-01-04","2023-01-05","2023-01-06"] |
| 3 | NULL | [7.7,8.8,8.8] | ["g","h","h"] | ["2023-01-07","2023-01-08","2023-01-09"] |
| 4 | [9,10,11] | NULL | ["k","k","j"] | ["2023-01-10","2023-01-12","2023-01-11"] |
| 5 | [12,13,14] | [10.1,11.11,11.11] | NULL | ["2023-01-13","2023-01-14","2023-01-15"] |
| 6 | [15,16,17] | [14.14,13.13,14.14] | ["m","o","o"] | NULL |
| 7 | [18,19,20] | [16.16,16.16,18.18] | ["p","p","r"] | ["2023-01-16",null,"2023-01-18"] |
| 8 | [21,22,23] | [19.19,20.2,19.19] | ["a","t","a"] | ["2023-01-19","2023-01-20","2023-01-21"] |
| 9 | [24,25,26] | NULL | ["y","y","z"] | ["2023-01-25","2023-01-24","2023-01-26"] |
| 10 | [24,25,26] | NULL | ["y","y","z"] | ["2023-01-25",null,"2023-01-26"] |
+------+------------+---------------------+---------------+------------------------------------------+

示例 1:根据array_col2array_col3array_col1进行排序。

select id, array_col1, array_col2, array_col3, array_sortby(array_col1, array_col2, array_col3) from test_array_sortby_multi order by id asc;
+------+------------+---------------------+---------------+--------------------------------------------------------+
| id | array_col1 | array_col2 | array_col3 | array_sortby(array_col1, array_col2, array_col3) |
+------+------------+---------------------+---------------+--------------------------------------------------------+
| 1 | [4,3,5] | [1.1,2.2,2.2] | ["a","b","c"] | [4,3,5] |
| 2 | [6,7,8] | [6.6,5.5,6.6] | ["d","e","d"] | [7,6,8] |
| 3 | NULL | [7.7,8.8,8.8] | ["g","h","h"] | NULL |
| 4 | [9,10,11] | NULL | ["k","k","j"] | [11,9,10] |
| 5 | [12,13,14] | [10.1,11.11,11.11] | NULL | [12,13,14] |
| 6 | [15,16,17] | [14.14,13.13,14.14] | ["m","o","o"] | [16,15,17] |
| 7 | [18,19,20] | [16.16,16.16,18.18] | ["p","p","r"] | [18,19,20] |
| 8 | [21,22,23] | [19.19,20.2,19.19] | ["a","t","a"] | [21,23,22] |
| 9 | [24,25,26] | NULL | ["y","y","z"] | [24,25,26] |
| 10 | [24,25,26] | NULL | ["y","y","z"] | [24,25,26] |
+------+------------+---------------------+---------------+--------------------------------------------------------+

示例 2:根据array_col2array_col3array_col4array_col1进行排序。

select id, array_col1, array_col2, array_col3, array_col4, array_sortby(array_col1, array_col2, array_col3, array_col4) from test_array_sortby_multi order by id asc;
+------+------------+---------------------+---------------+------------------------------------------+--------------------------------------------------------------------+
| id | array_col1 | array_col2 | array_col3 | array_col4 | array_sortby(array_col1, array_col2, array_col3, array_col4) |
+------+------------+---------------------+---------------+------------------------------------------+--------------------------------------------------------------------+
| 1 | [4,3,5] | [1.1,2.2,2.2] | ["a","b","c"] | ["2023-01-01","2023-01-02","2023-01-03"] | [4,3,5] |
| 2 | [6,7,8] | [6.6,5.5,6.6] | ["d","e","d"] | ["2023-01-04","2023-01-05","2023-01-06"] | [7,6,8] |
| 3 | NULL | [7.7,8.8,8.8] | ["g","h","h"] | ["2023-01-07","2023-01-08","2023-01-09"] | NULL |
| 4 | [9,10,11] | NULL | ["k","k","j"] | ["2023-01-10","2023-01-12","2023-01-11"] | [11,9,10] |
| 5 | [12,13,14] | [10.1,11.11,11.11] | NULL | ["2023-01-13","2023-01-14","2023-01-15"] | [12,13,14] |
| 6 | [15,16,17] | [14.14,13.13,14.14] | ["m","o","o"] | NULL | [16,15,17] |
| 7 | [18,19,20] | [16.16,16.16,18.18] | ["p","p","r"] | ["2023-01-16",null,"2023-01-18"] | [19,18,20] |
| 8 | [21,22,23] | [19.19,20.2,19.19] | ["a","t","a"] | ["2023-01-19","2023-01-20","2023-01-21"] | [21,23,22] |
| 9 | [24,25,26] | NULL | ["y","y","z"] | ["2023-01-25","2023-01-24","2023-01-26"] | [25,24,26] |
| 10 | [24,25,26] | NULL | ["y","y","z"] | ["2023-01-25",null,"2023-01-26"] | [25,24,26] |
+------+------------+---------------------+---------------+------------------------------------------+--------------------------------------------------------------------+

参考

array_sort