[ElasticSearch] (5) "Result window is too large": the trade-offs of deep paging


As the title says, while running a DSL query against Elasticsearch I hit the following error:

{
  "error": {
    "root_cause": [
      {
        "type": "query_phase_execution_exception",
        "reason": "Result window is too large, from + size must be less than or equal to: [200] but was [1000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level parameter."
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "fcar_city",
        "node": "7EtAlFI7QEOpQD3rHvTm0g",
        "reason": {
          "type": "query_phase_execution_exception",
          "reason": "Result window is too large, from + size must be less than or equal to: [200] but was [1000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level parameter."
        }
      }
    ]
  },
  "status": 500
}

This puzzled me, because my DSL query was simply:

{
  "query": {
    "bool": {
      "must": [
        { "match_all": {} }
      ]
    }
  },
  "from": 0,
  "size": 1000
}

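It is nothing more than a "match_all" query against a single index, "fcar_city", yet it fails with "Result window is too large". The message itself gives the cause: from + size (0 + 1000) exceeds the index's max_result_window, which is set to 200 here. The fix most search results suggest is to raise "max_result_window" so that it is at least as large as the from + size you request, for example: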

PUT fcar_city/_settings
{
  "index": {
    "max_result_window": 1000000
  }
}

After resetting max_result_window on the fcar_city index, the DSL query succeeded.
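Before raising the limit, it can help to check on the client side whether a request would be rejected at all. The following is a minimal sketch, not part of any Elasticsearch client: the function name `exceeds_result_window` is hypothetical, and the default of 10,000 is Elasticsearch's default for index.max_result_window (the fcar_city index in this post had been set to 200).

```python
def exceeds_result_window(from_: int, size: int, max_result_window: int = 10_000) -> bool:
    """Return True if Elasticsearch would reject the request.

    The rule from the error message: from + size must be less than or
    equal to index.max_result_window. The function name is hypothetical.
    """
    return from_ + size > max_result_window


# The request from this post: from=0, size=1000 against a limit of 200.
print(exceeds_result_window(0, 1000, max_result_window=200))   # rejected
# The same request after raising the limit to 1,000,000.
print(exceeds_result_window(0, 1000, max_result_window=1_000_000))
```

A check like this only mirrors the server-side rule; it does nothing about the cost of a deep page once the request is allowed through.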

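Along the way I came across a Stack Overflow post pointing out that changing this setting directly can cause problems such as high memory consumption. This touches on a concept called "deep paging", which the official Elasticsearch guide covers here: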

https://www.elastic.co/guide/en/elasticsearch/guide/current/pagination.html

https://www.elastic.co/guide/en/elasticsearch/guide/current/_fetch_phase.html

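A quick introduction to pagination:

To get the effect of MySQL's LIMIT in Elasticsearch, use the from and size parameters.

size: the number of results to return; defaults to 10

from: the number of initial results to skip; defaults to 0

For example, to show 5 records per page and fetch pages 1 to 3: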

GET /_search?size=5
GET /_search?size=5&from=5
GET /_search?size=5&from=10
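The mapping from a page number to from and size is simple arithmetic. The sketch below illustrates it with two hypothetical helper names, `page_params` and `search_path`, assuming 1-based page numbers.

```python
def page_params(page: int, size: int = 10) -> dict:
    """Translate a 1-based page number into Elasticsearch from/size values."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    # Skip every result that belongs to the earlier pages.
    return {"from": (page - 1) * size, "size": size}


def search_path(page: int, size: int = 10) -> str:
    """Build the query-string form of the request used in the examples above."""
    params = page_params(page, size)
    return f"/_search?size={params['size']}&from={params['from']}"


for page in (1, 2, 3):
    print("GET", search_path(page, size=5))
```

The first line prints from=0 explicitly, which is equivalent to omitting from since 0 is the default.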

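The root reason that raising max_result_window can lead to high memory consumption is that a search request usually spans multiple shards. Each shard produces its own sorted results, which then have to be sorted centrally to make sure the overall order is correct.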

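To see what happens when we page too deep or request too many results at once (which is exactly what a larger max_result_window permits), suppose we are searching an index with five primary shards. When we request the first page of results (results 1 to 10), each shard produces its own top 10 results and returns them to the coordinating node, which then sorts all 50 results to select the overall top 10. Now imagine we ask for page 1,001, that is, results 10,001 to 10,010. Everything works the same way, except that each shard has to produce its top 10,010 results. The coordinating node then sorts all 50,050 results and discards 50,040 of them! The official guide describes this cost as growing "exponentially"; more precisely, the coordinating node has to handle number_of_shards * (from + size) entries, so in a distributed system the work grows in proportion to how deep we page, no matter how few results are finally returned.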
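The five-shard example above can be reproduced with a small simulation. This is only a sketch of the scatter-gather idea, not Elasticsearch's actual implementation: the scores are random numbers, the document IDs are made up, and the function names `shard_top_n` and `coordinate` are hypothetical.

```python
import heapq
import random


def shard_top_n(docs, n):
    """Query phase on one shard: keep only the n best-scoring entries."""
    return heapq.nlargest(n, docs)


def coordinate(shards, from_, size):
    """Merge the per-shard candidates, then keep only the requested window."""
    window = from_ + size
    candidates = []
    for docs in shards:
        # Every shard must return from + size entries, not just size.
        candidates.extend(shard_top_n(docs, window))
    candidates.sort(reverse=True)
    page = candidates[from_:from_ + size]
    return page, len(candidates), len(candidates) - len(page)


random.seed(0)
# Five shards, each holding 20,000 (score, doc_id) pairs.
shards = [
    [(random.random(), f"shard{s}-doc{i}") for i in range(20_000)]
    for s in range(5)
]

page, merged, discarded = coordinate(shards, from_=10_000, size=10)
print(f"merged {merged} entries, discarded {discarded}, returned {len(page)}")
```

With from=10,000 and size=10, the coordinating step merges 50,050 entries and throws away 50,040, matching the numbers in the text.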

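In addition, when a search runs in a distributed environment, the fetch phase works as follows:

1. The coordinating node identifies which documents need to be fetched and issues a multi GET request to the relevant shards.
2. Each shard loads the documents and enriches them, if required, and then returns the documents to the coordinating node.
3. Once all documents have been fetched, the coordinating node returns the results to the client.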

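The coordinating node first decides which documents actually need to be fetched. For instance, if our query specified { "from": 90, "size": 10 }, the first 90 results would be discarded and only the next 10 results would need to be retrieved. These documents may come from one, some, or all of the shards involved in the original search request. Once the coordinating node has received all the results, it assembles them into a single response that is returned to the client.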
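The first step of the fetch phase, deciding which documents to retrieve and from which shards, can be sketched as follows. This is an illustration under stated assumptions rather than Elasticsearch's code: the globally sorted hits are represented as made-up (shard_id, doc_id) pairs, and `plan_fetch` is a hypothetical name.

```python
from collections import defaultdict


def plan_fetch(sorted_hits, from_, size):
    """Pick the requested window and group its doc IDs by owning shard.

    sorted_hits is the globally sorted list of (shard_id, doc_id) pairs
    produced by the query phase.
    """
    wanted = sorted_hits[from_:from_ + size]  # drop the first `from_` hits
    per_shard = defaultdict(list)
    for shard_id, doc_id in wanted:
        per_shard[shard_id].append(doc_id)
    return dict(per_shard)


# 100 made-up hits spread round-robin over 3 shards.
sorted_hits = [(i % 3, f"doc{i}") for i in range(100)]

# { "from": 90, "size": 10 }: only hits 90..99 need to be fetched.
for shard_id, doc_ids in sorted(plan_fetch(sorted_hits, 90, 10).items()):
    print(f"shard {shard_id}: fetch {doc_ids}")
```

Each entry in the resulting mapping corresponds to one multi GET request sent to a shard; shards that own none of the 10 wanted documents receive no request.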

The guide's section on the fetch phase also explains how deep pagination plays out across multiple shards:

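The query-then-fetch process supports pagination with the from and size parameters, but within limits. Remember that each shard must build a priority queue of length from + size, and all of these queues need to be passed back to the coordinating node. The coordinating node then needs to sort through number_of_shards * (from + size) documents in order to find the correct size documents. Depending on the size of your documents, the number of shards, and your hardware, paging 10,000 to 50,000 results (1,000 to 5,000 pages) deep should be perfectly doable. With big enough from values, however, the sorting process becomes very heavy, consuming large amounts of CPU, memory, and bandwidth.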
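To get a feel for the formula number_of_shards * (from + size), the sketch below tabulates it for a few page depths. The shard count of 5 and the page size of 10 are assumptions chosen for illustration, and `coordinator_entries` is a hypothetical name.

```python
def coordinator_entries(number_of_shards: int, from_: int, size: int) -> int:
    """Entries the coordinating node must sort: number_of_shards * (from + size)."""
    return number_of_shards * (from_ + size)


SHARDS = 5  # assumed shard count, for illustration only
SIZE = 10   # assumed page size

for from_ in (0, 10_000, 50_000, 1_000_000):
    total = coordinator_entries(SHARDS, from_, SIZE)
    print(f"from={from_:>9,} -> sort {total:>9,} entries to return {SIZE}")
```

The number of entries to sort scales with from, while the number of results returned stays fixed at size, which is why very deep pages cost far more than they deliver.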

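So, to get rid of an error like "Result window is too large, from + size must be less than or equal to: [200] but was [1000]", the lazy fix is to set max_result_window high enough for your business needs, at the expense of cluster performance. If you want to avoid the high memory consumption caused by deep paging, see the next post in this series, which covers the scroll API.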
