approx_top_k
返回 expr
中出现频率最高的 k
个 item 值及其近似计数。
此函数从 v3.0 版本开始支持。
语法
APPROX_TOP_K(<expr> [ , <k> [ , <counter_num> ] ] )
参数
-
expr
: STRING、BOOLEAN、DATE、DATETIME 或数值类型的表达式。 -
k
:可选的 INTEGER 字面值,大于 0。如果未指定k
,则默认为5
。最大值为100000
。 -
counter_num
:可选的 INTEGER 字面值,大于或等于k
。counter_num
越大,结果越准确。但是,这也会增加 CPU 和内存成本。- 最大值为
100000
。 - 如果未指定
counter_num
,则默认为max(min(2 * k, 100), 100000)
。
- 最大值为
返回值
结果作为 STRUCT 类型的 ARRAY 返回,其中每个 STRUCT 包含一个用于值的 item
字段(具有其原始输入类型)和一个 count
字段(BIGINT 类型),其中包含近似的出现次数。数组按 count
降序排序。
聚合函数返回表达式 expr 中出现频率最高的 k
个 item 值及其近似计数。每个计数的误差可能高达 2.0 * numRows / counter_num
,其中 numRows
是总行数。较高的 counter_num
值以增加内存使用量为代价提供更好的准确性。项目数少于 counter_num
的表达式将产生精确的项目计数。结果包括 NULL
值作为结果中自己的项目。
示例
使用scores表中的数据作为示例。
-- Calculate the score distribution of each subject.
MySQL > SELECT subject, APPROX_TOP_K(score) AS top_k FROM scores GROUP BY subject;
+---------+--------------------------------------------------------------------------------------------------------------------+
| subject | top_k |
+---------+--------------------------------------------------------------------------------------------------------------------+
| physics | [{"item":99,"count":2},{"item":null,"count":1},{"item":100,"count":1},{"item":85,"count":1},{"item":60,"count":1}] |
| english | [{"item":null,"count":1},{"item":92,"count":1},{"item":98,"count":1},{"item":100,"count":1},{"item":85,"count":1}] |
| NULL | [{"item":90,"count":1}] |
| math | [{"item":80,"count":2},{"item":null,"count":1},{"item":92,"count":1},{"item":95,"count":1},{"item":70,"count":1}] |
+---------+--------------------------------------------------------------------------------------------------------------------+
-- Calculate the score distribution of the math subject.
MySQL > SELECT subject, APPROX_TOP_K(score) AS top_k FROM scores WHERE subject IN ('math') GROUP BY subject;
+---------+-------------------------------------------------------------------------------------------------------------------+
| subject | top_k |
+---------+-------------------------------------------------------------------------------------------------------------------+
| math | [{"item":80,"count":2},{"item":null,"count":1},{"item":95,"count":1},{"item":92,"count":1},{"item":70,"count":1}] |
+---------+-------------------------------------------------------------------------------------------------------------------+
关键词
APPROX_TOP_K