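Counting the number of buckets in a multi-key ES aggregation
During development and verification, ElasticSearch does not report how many buckets an aggregation produced, which makes reconciling data very tedious. Here are a couple of solutions: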
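Counting in Java code
After the Java code sends the query and gets the response back, buckets is returned as an array, so the size of that array is the number of aggregation buckets. I know this solution may draw some criticism.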
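As a rough illustration of the client-side approach, the sketch below counts the buckets in a response that has already been parsed into a dictionary. It is written in Python rather than Java purely for brevity; the helper name count_buckets is hypothetical, and the embedded JSON is a trimmed copy of the sample response from this article, not output from a live cluster.

```python
import json

# Trimmed copy of the article's sample response (only the fields used here).
RESPONSE_JSON = """
{
  "aggregations": {
    "groupby": {
      "buckets": [
        {"key": "1437002_76-split-123", "doc_count": 2},
        {"key": "1437002_77-split-234", "doc_count": 1},
        {"key": "123_7759-split-345", "doc_count": 1}
      ]
    }
  }
}
"""


def count_buckets(response: dict, agg_name: str) -> int:
    """Return how many buckets the named aggregation returned (hypothetical helper)."""
    return len(response["aggregations"][agg_name]["buckets"])


response = json.loads(RESPONSE_JSON)
bucket_count = count_buckets(response, "groupby")
print(bucket_count)  # 3
```

Note that this counts only the buckets that came back in the response, so it is subject to the same size limit as the terms aggregation itself.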
Counting with an ES query
GET cn_order*/_search
{
  "size": 0,
  "aggregations": {
    "groupby": {
      "terms": {
        "script": {"inline": "doc['order_id'].value+'-split-'+doc['merchant_id'].value"},
        "size": 200
      },
      "aggregations": {
        "marketFee": {"sum": {"field": "market_fee"}}
      }
    }
  }
}
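The advantage of using terms with a script is that the same query shape works for both single-key and multi-key aggregation. Example response: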
{
  "took": 2,
  "timed_out": false,
  "_shards": {"total": 90, "successful": 90, "failed": 0},
  "hits": {"total": 4, "max_score": 0, "hits": []},
  "aggregations": {
    "groupby": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {"doc_count": 2, "marketFee": {"value": 4.2}, "key": "1437002_76-split-123"},
        {"doc_count": 1, "marketFee": {"value": 2.1}, "key": "1437002_77-split-234"},
        {"doc_count": 1, "marketFee": {"value": 2.1}, "key": "123_7759-split-345"}
      ]
    }
  }
}
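Each key follows the format written in the script, doc['order_id'].value+'-split-'+doc['merchant_id'].value. The quoted name inside each pair of square brackets is the field being used as a key.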
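To make the key format concrete, here is a minimal Python sketch that builds the script string for any list of fields and splits a returned key back into its parts. The helper names build_key_script and split_key are hypothetical. It also assumes that no field value contains the separator '-split-', since otherwise the split would be ambiguous.

```python
SEPARATOR = "-split-"  # the separator used in the article's script


def build_key_script(fields, separator=SEPARATOR):
    """Build the script string that concatenates the given field values (hypothetical helper)."""
    parts = [f"doc['{name}'].value" for name in fields]
    return f"+'{separator}'+".join(parts)


def split_key(key, separator=SEPARATOR):
    """Split a composite bucket key back into one value per field.

    Assumes no field value itself contains the separator.
    """
    return tuple(key.split(separator))


script = build_key_script(["order_id", "merchant_id"])
print(script)  # doc['order_id'].value+'-split-'+doc['merchant_id'].value
print(split_key("1437002_76-split-123"))  # ('1437002_76', '123')
```

With a single field the helper returns just doc['order_id'].value, which matches the point that the same terms-plus-script query covers single-key aggregation too.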
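Back to the main question: how do we count the buckets in the whole result?
One trick is to build a per-bucket counter inside the query that always evaluates to 1, and then sum that counter across all buckets. Here is the query: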
GET cn_energy_charge_bill*/_search
{
  "size": 0,
  "aggregations": {
    "keycount": {
      "sum_bucket": {"buckets_path": "groupby>uniqueId"}
    },
    "groupby": {
      "terms": {
        "script": {"inline": "doc['order_id'].value+'-split-'+doc['merchant_id'].value"},
        "size": 200
      },
      "aggregations": {
        "marketFee": {"sum": {"field": "market_fee"}},
        "uniqueId": {
          "cardinality": {
            "script": {"inline": "doc['order_id'].value+'-split-'+doc['merchant_id'].value"}
          }
        }
      }
    }
  }
}
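The important part is the extra aggregation that sits outside the bucket aggregation. Inside the bucket aggregation we add a sub-aggregation on the key itself and deduplicate it with cardinality. Since every document in a bucket shares the same key, each bucket contains exactly one distinct key and this sub-aggregation always returns 1. The sibling aggregation keycount then sums "uniqueId" across the buckets, and that sum is the number of buckets. The query result looks like this: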
{
  "took": 2,
  "timed_out": false,
  "_shards": {"total": 90, "successful": 90, "failed": 0},
  "hits": {"total": 4, "max_score": 0, "hits": []},
  "aggregations": {
    "keycount": {"value": 3},
    "groupby": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {"doc_count": 2, "marketFee": {"value": 4.2}, "uniqueId": {"value": 1}, "key": "1437002_76-split-123"},
        {"doc_count": 1, "marketFee": {"value": 2.1}, "uniqueId": {"value": 1}, "key": "1437002_77-split-234"},
        {"doc_count": 1, "marketFee": {"value": 2.1}, "uniqueId": {"value": 1}, "key": "123_7759-split-345"}
      ]
    }
  }
}
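The value of keycount is the number of buckets, and you can also see that uniqueId is 1. The one drawback is that the size of the terms aggregation has to be set large enough. My query sets it to 200 and the result has only 3 buckets, so the count is 3. If there were more than 200 buckets, however, only 200 would be returned and the count would be 200. In other words, the count is based on the buckets the query actually returns. To count all buckets, size must be at least as large as the true number of buckets.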
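The effect of size on the count can be reproduced locally. The following Python sketch mimics the query in plain code, assuming the default terms ordering (buckets with the most documents first). The documents in DOCS are hypothetical, chosen only to be consistent with the sample response, and simulate_keycount is an illustrative name, not an Elasticsearch API.

```python
from collections import Counter

# Hypothetical documents consistent with the article's sample response.
DOCS = [
    {"order_id": "1437002_76", "merchant_id": "123"},
    {"order_id": "1437002_76", "merchant_id": "123"},
    {"order_id": "1437002_77", "merchant_id": "234"},
    {"order_id": "123_7759", "merchant_id": "345"},
]


def simulate_keycount(docs, size):
    """Local imitation of the query: terms on the composite key, then sum of per-bucket cardinality."""
    keys = [f"{d['order_id']}-split-{d['merchant_id']}" for d in docs]
    # terms aggregation: keep only the `size` buckets with the most documents
    buckets = Counter(keys).most_common(size)
    # cardinality sub-aggregation: distinct keys among the documents of one bucket (always 1)
    unique_ids = [len({k for k in keys if k == bucket_key}) for bucket_key, _ in buckets]
    # sum_bucket: add up the per-bucket values
    return sum(unique_ids)


print(simulate_keycount(DOCS, size=200))  # 3
print(simulate_keycount(DOCS, size=2))    # 2
```

With size=2 the count drops to 2 even though the data holds three distinct keys, which is exactly the limitation described above.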