Elasticsearch: Metrics Aggregations

2023-11-02 12:10


Preface

This article is based on Elasticsearch 7.3.0.

Basic structure of an aggregation

"aggregations" : {
    "<aggregation_name>" : {
        "<aggregation_type>" : {
            <aggregation_body>
        }
        [,"meta" : { [<meta_data_body>] } ]?
        [,"aggregations" : { [<sub_aggregation>]+ } ]?
    }
    [,"<aggregation_name_2>" : { ... } ]*
}
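As a rough illustration of the layout above, the same nesting can be assembled as a plain dictionary and serialized to JSON. This is a minimal sketch using only the Python standard library. The helper name build_agg is hypothetical and not part of any Elasticsearch client, and the field names tag and price are the ones used by the test index in this article.

```python
import json


def build_agg(name, agg_type, body, meta=None, sub_aggs=None):
    """Return {name: {agg_type: body, ...}} following the generic aggregation layout.

    build_agg is a hypothetical helper for illustration only.
    """
    agg = {agg_type: body}
    if meta is not None:
        agg["meta"] = meta                # optional metadata block
    if sub_aggs:
        agg["aggregations"] = sub_aggs    # optional nested sub-aggregations
    return {name: agg}


# A terms bucket aggregation with an avg metric nested under it.
request_body = {
    "size": 0,
    "aggregations": build_agg(
        "tag_terms",
        "terms",
        {"field": "tag"},
        sub_aggs=build_agg("price_avg", "avg", {"field": "price"}),
    ),
}

print(json.dumps(request_body, indent=2))
```

The printed JSON could be pasted as the body of a GET my_index/_search request. The shorter key "aggs" used in the rest of this article is an accepted alias for "aggregations".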

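Preparing test data

The index has a keyword field tag and a scaled_float field price. Three documents are indexed, with the tags 没有价格的水果 (a fruit with no price), 橘子 (orange) and 苹果 (apple). The first document deliberately has no price field.

PUT my_index
{
  "mappings": {
    "properties": {
      "tag": {
        "type": "keyword"
      },
      "price": {
        "type": "scaled_float",
        "scaling_factor": 100
      }
    }
  }
}

PUT my_index/_doc/1
{
  "tag": "没有价格的水果"
}

PUT my_index/_doc/2
{
  "tag": "橘子",
  "price": "1.00"
}

PUT my_index/_doc/3
{
  "tag": "苹果",
  "price": "9.00"
}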

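avg, max, min, sum, value_count, stats, extended_stats

These aggregations share almost the same syntax, so they are covered together.

  • avg: the average of the values
  • max: the maximum value
  • min: the minimum value
  • sum: the sum of the values
  • value_count: the number of values
  • stats: returns count, min, max, avg and sum in a single response
  • extended_stats: an extension of stats that adds further statistics such as sum_of_squares, variance and std_deviation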

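Computing the average fruit price

The missing parameter sets a default value for documents that lack the field. Here the fruit with no price is counted as if its price were 1.

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "price_avg": {
      "avg": {
        "field": "price",
        "missing": 1
      }
    }
  }
}

Response

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : []
  },
  "aggregations" : {
    "price_avg" : {
      "value" : 3.6666666666666665
    }
  }
}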
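The value 3.6666666666666665 follows directly from the test data: with missing set to 1, the three prices taken into account are 1, 1 and 9. The sketch below reproduces that arithmetic in plain Python. It only illustrates the semantics and is not how Elasticsearch computes the result internally; the function name avg_with_missing is hypothetical.

```python
# Prices of the three test documents; None marks the document without a price field.
prices = [None, 1.00, 9.00]


def avg_with_missing(values, missing=None):
    """Average the values.

    If `missing` is given, absent values are replaced by it;
    otherwise absent values are skipped entirely.
    """
    if missing is not None:
        usable = [missing if v is None else v for v in values]
    else:
        usable = [v for v in values if v is not None]
    return sum(usable) / len(usable) if usable else None


print(avg_with_missing(prices, missing=1))  # 3.6666666666666665
print(avg_with_missing(prices))             # 5.0
```

Without the missing parameter the document that has no price is simply ignored, which is why the stats example later in this article reports an average of 5.0 over two values.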

Using a script

Instead of naming a field, the aggregation can take its values from a script.

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "price_avg": {
      "avg": {
        "script": {
          "source": "doc['price']"
        }
      }
    }
  }
}

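Using a value script

When both field and script are given, the script is applied to each field value, which is available as _value. Here every price is multiplied by params.number before it is averaged.

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "price_avg": {
      "avg": {
        "field": "price",
        "script": {
          "lang": "painless",
          "source": "_value * params.number",
          "params": {
            "number": 1.5
          }
        }
      }
    }
  }
}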
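To make the effect of the value script concrete, the following sketch applies the same transformation to the two prices in the test data. It is plain Python arithmetic under the assumption that documents without a price contribute nothing; the function name apply_value_script is hypothetical, and the article does not show the actual response for this request.

```python
# Prices of the documents that actually have a price field.
prices = [1.00, 9.00]


def apply_value_script(values, number):
    """Mimic the value script `_value * params.number` on each field value."""
    return [v * number for v in values]


scaled = apply_value_script(prices, 1.5)
print(scaled)                     # [1.5, 13.5]
print(sum(scaled) / len(scaled))  # 7.5
```

Under that reading, the avg aggregation would be expected to report 7.5 instead of 5.0.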

The stats aggregation

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "price_stats": {
      "stats": {
        "field": "price"
      }
    }
  }
}

Response

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "price_stats" : {
      "count" : 2,
      "min" : 1.0,
      "max" : 9.0,
      "avg" : 5.0,
      "sum" : 10.0
    }
  }
}

Only two documents have a price, so count is 2 even though the query matches three documents.
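The numbers in the stats response can be checked by hand. The sketch below recomputes them for the two prices and also derives the extra figures that extended_stats adds. It assumes a non-empty list of values and uses the population variance (dividing by count), which is an assumption about what Elasticsearch 7.3 reports rather than something shown in this article.

```python
import math


def stats(values):
    """Return count, min, max, avg and sum for a non-empty list of numbers."""
    count = len(values)
    total = sum(values)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": total / count,
        "sum": total,
    }


def extended_stats(values):
    """Add sum_of_squares, variance and std_deviation to the basic stats.

    Assumption: variance is the population variance (divide by count).
    """
    result = stats(values)
    sum_of_squares = sum(v * v for v in values)
    variance = sum_of_squares / result["count"] - result["avg"] ** 2
    result["sum_of_squares"] = sum_of_squares
    result["variance"] = variance
    result["std_deviation"] = math.sqrt(variance)
    return result


prices = [1.00, 9.00]
print(stats(prices))
print(extended_stats(prices))
```

For the prices 1 and 9 this gives a sum of squares of 82, a variance of 16 and a standard deviation of 4.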

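cardinality

Counts the number of distinct values. The result is an approximation, not an exact count.

The precision_threshold option trades memory for accuracy. It defines a distinct count below which results are expected to be close to exact. Above that value, results may become less accurate. The maximum supported value is 40000, and thresholds above that number have the same effect as a threshold of 40000. The default is 3000.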

# Count the distinct tag values
GET my_index/_search
{
  "size": 0,
  "aggs": {
    "tag_cardinality": {
      "cardinality": {
        "field": "tag",
        "precision_threshold": 3000
      }
    }
  }
}
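The Elasticsearch documentation describes cardinality as based on the HyperLogLog++ algorithm. To give a feel for why the count is approximate and why more memory buys more accuracy, here is a minimal sketch of plain HyperLogLog with a linear-counting correction for small counts. It is not Elasticsearch's implementation: the function name hll_estimate, the use of SHA-1 as the hash and the parameter p (number of index bits, so 2**p registers) are all choices made for this illustration.

```python
import hashlib
import math


def hll_estimate(values, p=14):
    """Estimate the number of distinct values with plain HyperLogLog.

    p index bits give m = 2**p registers; more registers mean more memory
    and a smaller expected error. Illustrative only, not ES's HyperLogLog++.
    """
    m = 1 << p
    registers = [0] * m
    for value in values:
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")   # take a 64-bit hash
        index = h >> (64 - p)                   # first p bits select a register
        rest = h & ((1 << (64 - p)) - 1)        # remaining 64 - p bits
        # rank = 1-based position of the leftmost 1-bit in the remaining bits
        rank = (64 - p) - rest.bit_length() + 1
        if rank > registers[index]:
            registers[index] = rank

    alpha = 0.7213 / (1 + 1.079 / m)            # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)

    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros > 0:
        # Small-range correction: fall back to linear counting.
        return m * math.log(m / zeros)
    return raw


tags = ["没有价格的水果", "橘子", "苹果", "橘子"]   # one duplicate tag
print(round(hll_estimate(tags)))

many = ["item-%d" % i for i in range(10000)]
print(hll_estimate(many))                        # close to 10000, but not exact
```

Feeding the same value more than once never changes a register, which is what makes the structure suitable for counting distinct values. Lowering p shrinks the register array and widens the expected error, which mirrors the memory-for-accuracy trade-off described for precision_threshold.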

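Using a script

The cardinality metric supports scripts, but performance suffers noticeably because the hashes have to be computed on the fly.

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "tag_cardinality": {
      "cardinality": {
        "script": {
          "lang": "painless",
          "source": "doc['tag']+' '+doc['price']"
        }
      }
    }
  }
}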

percentiles

The percentiles aggregation. The field must be a numeric field.

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "price_percentiles": {
      "percentiles": {
        "field": "price"
      }
    }
  }
}

By default, the percentiles metric generates the following range of percentiles: [ 1, 5, 25, 50, 75, 95, 99 ]

Response

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "price_percentiles" : {
      "values" : {
        "1.0" : 1.0,
        "5.0" : 1.0,
        "25.0" : 1.0,
        "50.0" : 5.0,
        "75.0" : 9.0,
        "95.0" : 9.0,
        "99.0" : 9.0
      }
    }
  }
}
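With only two prices (1 and 9), the response shows the 50th percentile interpolated to 5.0 and everything at or below the 25th or at or above the 75th clamped to the nearest value. The sketch below reproduces those numbers with a simple midpoint interpolation, where the i-th sorted value is placed at position (i + 0.5) / n. This is an assumption chosen because it matches the output for this tiny data set. Elasticsearch computes percentiles with an approximation algorithm (TDigest by default), so the sketch is not its actual method and may differ on larger data.

```python
def percentile(sorted_values, percent):
    """Midpoint-interpolated percentile of a non-empty, ascending list.

    Assumption: value i sits at rank (i + 0.5) / n; ranks outside the first
    and last midpoints are clamped to the minimum and maximum.
    """
    n = len(sorted_values)
    pos = percent / 100.0 * n - 0.5      # fractional index into sorted_values
    if pos <= 0:
        return sorted_values[0]
    if pos >= n - 1:
        return sorted_values[-1]
    low = int(pos)
    frac = pos - low
    return sorted_values[low] + frac * (sorted_values[low + 1] - sorted_values[low])


prices = [1.00, 9.00]
default_percents = [1, 5, 25, 50, 75, 95, 99]
result = {float(p): percentile(prices, p) for p in default_percents}
print(result)
# {1.0: 1.0, 5.0: 1.0, 25.0: 1.0, 50.0: 5.0, 75.0: 9.0, 95.0: 9.0, 99.0: 9.0}
```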

Use the percents parameter to specify exactly which percentiles to compute. Setting keyed to false returns the values as an array instead of an object keyed by percentile.

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "price_percentiles": {
      "percentiles": {
        "field": "price",
        "keyed": false,
        "percents": [
          95,
          99,
          99.99
        ]
      }
    }
  }
}

Response

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "price_percentiles" : {
      "values" : [
        {
          "key" : 95.0,
          "value" : 9.0
        },
        {
          "key" : 99.0,
          "value" : 9.0
        },
        {
          "key" : 99.99,
          "value" : 9.0
        }
      ]
    }
  }
}

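Using a script

As with avg, a value script can transform each field value before the percentiles are computed. Here every price is multiplied by 10.

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "price_percentiles": {
      "percentiles": {
        "field": "price",
        "script": {
          "lang": "painless",
          "source": "_value * params.number",
          "params": {
            "number": 10
          }
        }
      }
    }
  }
}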

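percentile_ranks

The counterpart of percentiles: instead of asking for the value at a given percentile, you supply one or more values and get back the percentage of observed values that fall at or below each of them. The field must be a numeric field.

In the request below the value script multiplies every price by 10, so the observed values are 10 and 90. Both are at or below 90 and 99, which is why both ranks come back as 100.

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "price_percentile_ranks": {
      "percentile_ranks": {
        "field": "price",
        "values": [
          90,
          99
        ],
        "keyed": false,
        "script": {
          "lang": "painless",
          "source": "_value * params.number",
          "params": {
            "number": 10
          }
        }
      }
    }
  }
}

Response

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "price_percentile_ranks" : {
      "values" : [
        {
          "key" : 90.0,
          "value" : 100.0
        },
        {
          "key" : 99.0,
          "value" : 100.0
        }
      ]
    }
  }
}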
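The two ranks of 100.0 in the response can be reproduced with the textbook definition of a percentile rank: the share of observed values that are less than or equal to a threshold. The sketch below applies the value script first and then counts. It is an exact calculation for illustration; Elasticsearch uses an approximation algorithm, so results on larger data sets may differ slightly. The function name percentile_ranks here is just a local Python function.

```python
def percentile_ranks(values, thresholds):
    """For each threshold, return the percentage of values that are <= it."""
    n = len(values)
    return {t: 100.0 * sum(1 for v in values if v <= t) / n for t in thresholds}


prices = [1.00, 9.00]
scaled = [p * 10 for p in prices]            # value script: _value * params.number
print(scaled)                                # [10.0, 90.0]
print(percentile_ranks(scaled, [90, 99]))    # {90: 100.0, 99: 100.0}
print(percentile_ranks(scaled, [50]))        # {50: 50.0}
```

A threshold of 50 would sit between the two scaled values, so under this exact definition only half of the values fall at or below it.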

top_hits

This aggregator is intended to be used as a sub-aggregator, so that the top matching documents can be collected for each bucket. In the example below, documents are bucketed by tag and each bucket returns its documents sorted by price in descending order.

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "tag_terms": {
      "terms": {
        "field": "tag",
        "size": 10
      },
      "aggs": {
        "tag_top": {
          "top_hits": {
            "from": 0,
            "size": 10,
            "sort": [
              {
                "price": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  }
}

Response

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "tag_terms" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "橘子",
          "doc_count" : 1,
          "tag_top" : {
            "hits" : {
              "total" : {
                "value" : 1,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "my_index",
                  "_type" : "_doc",
                  "_id" : "2",
                  "_score" : null,
                  "_source" : {
                    "tag" : "橘子",
                    "price" : "1.00"
                  },
                  "sort" : [
                    1.0
                  ]
                }
              ]
            }
          }
        },
        {
          "key" : "没有价格的水果",
          "doc_count" : 1,
          "tag_top" : {
            "hits" : {
              "total" : {
                "value" : 1,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "my_index",
                  "_type" : "_doc",
                  "_id" : "1",
                  "_score" : null,
                  "_source" : {
                    "tag" : "没有价格的水果"
                  },
                  "sort" : [
                    "-Infinity"
                  ]
                }
              ]
            }
          }
        },
        {
          "key" : "苹果",
          "doc_count" : 1,
          "tag_top" : {
            "hits" : {
              "total" : {
                "value" : 1,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "my_index",
                  "_type" : "_doc",
                  "_id" : "3",
                  "_score" : null,
                  "_source" : {
                    "tag" : "苹果",
                    "price" : "9.00"
                  },
                  "sort" : [
                    9.0
                  ]
                }
              ]
            }
          }
        }
      ]
    }
  }
}
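Conceptually, the terms plus top_hits combination is a "group by tag, then keep the top N per group" operation. The sketch below mimics that on an in-memory copy of the test documents. It is an illustration only: prices are written as floats rather than the strings stored in _source, the function name top_hits_per_tag is hypothetical, and documents without a price are given a sort key of negative infinity to mirror the "-Infinity" sort value in the response above.

```python
from collections import defaultdict

# In-memory copy of the three test documents (prices as floats for sorting).
docs = [
    {"_id": "1", "tag": "没有价格的水果"},
    {"_id": "2", "tag": "橘子", "price": 1.00},
    {"_id": "3", "tag": "苹果", "price": 9.00},
]


def top_hits_per_tag(documents, size=10):
    """Bucket documents by tag, then keep the `size` highest-priced ones per bucket."""
    buckets = defaultdict(list)
    for doc in documents:
        buckets[doc["tag"]].append(doc)
    return {
        tag: sorted(
            hits,
            key=lambda d: d.get("price", float("-inf")),  # missing price sorts last
            reverse=True,
        )[:size]
        for tag, hits in buckets.items()
    }


for tag, hits in top_hits_per_tag(docs).items():
    print(tag, [hit["_id"] for hit in hits])
```

Because every tag occurs only once in the test data, each bucket holds a single document. With several documents per tag, lowering size would keep only the most expensive ones in each bucket.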




http://www.chinasem.cn/article/330489
