Elasticsearch: nested (mapping, query, aggregation)

2023-12-07 00:08


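Elasticsearch provides a nested field type. It exists because operations on arrays of objects often do not behave the way we expect: Lucene has no concept of inner objects, so ES flattens hierarchical JSON data into a flat list of key/value pairs.

Take this document as an example: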

PUT my_index/my_type/5
{
  "owner": "小李",
  "family": [
    {"call": "dad", "name": "李俊杰"},
    {"call": "mom", "name": "李翠莲"}
  ]
}

After flattening, the document is effectively stored as:

{
  "owner": "小李",
  "family.call": ["dad", "mom"],
  "family.name": ["李俊杰", "李翠莲"]
}

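With this layout, a search for a dad named 李翠莲 also matches the document, because the link between each call and its name is lost. That is clearly not what we expect.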
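The flattening described above can be imitated in a few lines of Python. This is only an illustrative sketch of the idea, not how Lucene or Elasticsearch is implemented; the helper name `flatten` is hypothetical, and unlike the simplified JSON above it stores every field (including `owner`) as a list of values.

```python
from collections import defaultdict


def flatten(doc, prefix=""):
    """Flatten inner objects into dotted field names mapped to lists of values."""
    flat = defaultdict(list)
    for key, value in doc.items():
        path = f"{prefix}{key}"
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                # Merge the values of each inner object into the shared lists.
                for sub_path, sub_values in flatten(item, path + ".").items():
                    flat[sub_path].extend(sub_values)
            else:
                flat[path].append(item)
    return dict(flat)


doc = {
    "owner": "小李",
    "family": [
        {"call": "dad", "name": "李俊杰"},
        {"call": "mom", "name": "李翠莲"},
    ],
}

flat = flatten(doc)
# After flattening, "dad" and "李翠莲" are just members of two unrelated lists,
# so a check on both fields succeeds even though they come from different objects.
matches = "dad" in flat["family.call"] and "李翠莲" in flat["family.name"]
print(flat)
print(matches)
```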

The nested field type, by contrast, allows each object in an array to be indexed and queried independently of the others.
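Conceptually, "independently" means that all conditions must hold on the same object. A rough Python sketch of that matching rule follows; the function name `nested_match` is hypothetical and this only models the semantics, not the way ES stores nested documents.

```python
def nested_match(doc, path, conditions):
    """Return True if a single object under `path` satisfies every condition."""
    return any(
        all(obj.get(field) == value for field, value in conditions.items())
        for obj in doc.get(path, [])
    )


doc = {
    "owner": "小李",
    "family": [
        {"call": "dad", "name": "李俊杰"},
        {"call": "mom", "name": "李翠莲"},
    ],
}

# No single family member is both "dad" and named 李翠莲.
wrong_pair = nested_match(doc, "family", {"call": "dad", "name": "李翠莲"})
# One family member is "dad" and named 李俊杰.
right_pair = nested_match(doc, "family", {"call": "dad", "name": "李俊杰"})
print(wrong_pair, right_pair)
```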

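First, let's take the usual approach and see what results it gives.

Example without nested:

With dynamic mapping we insert the data directly, and the array of objects is mapped as the object type.

Step 1: add data: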

PUT my_index/my_type/1
{
  "owner": "张三",
  "family": [
    {"call": "dad", "name": "张三爸"},
    {"call": "mom", "name": "张三妈"}
  ]
}
PUT my_index/my_type/2
{
  "owner": "李四",
  "family": [
    {"call": "dad", "name": "李四爸"},
    {"call": "mom", "name": "李四妈"}
  ]
}
PUT my_index/my_type/3
{
  "owner": "王五",
  "family": [
    {"call": "dad", "name": "王五爸"},
    {"call": "mom", "name": "王五妈"}
  ]
}
PUT my_index/my_type/4
{
  "owner": "赵六",
  "family": [
    {"call": "dad", "name": "赵六爸"},
    {"call": "mom", "name": "赵六妈"}
  ]
}
PUT my_index/my_type/5
{
  "owner": "我",
  "family": [
    {"call": "dad", "name": "我老爸"},
    {"call": "mom", "name": "我老妈"}
  ]
}
Running the requests above adds five documents to the type my_type in the index my_index.

Step 2: check the resulting mapping (for example with GET my_index/_mapping)

{
  "my_index": {
    "mappings": {
      "my_type": {
        "properties": {
          "family": {
            "properties": {
              "call": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}
              },
              "name": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}
              }
            }
          },
          "owner": {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}
          }
        }
      }
    }
  }
}
As we can see, family is an object type with two properties.

Step 3: query:

must + term query

GET my_index/my_type/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {"family.name.keyword": "王五妈"}},
        {"term": {"family.call.keyword": "dad"}}
      ]
    }
  }
}
We expect no hits, because no dad is named 王五妈.

The actual response is:

{
  "took": 1,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 1,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "3",
        "_score": 0.5753642,
        "_source": {
          "owner": "王五",
          "family": [
            {"call": "dad", "name": "王五爸"},
            {"call": "mom", "name": "王五妈"}
          ]
        }
      }
    ]
  }
}

Step 4: aggregate to get statistics for all dads

GET my_index/my_type/_search?size=0
{
  "aggs": {
    "call": {
      "filter": {"term": {"family.call.keyword": "dad"}},
      "aggs": {
        "name": {
          "terms": {"field": "family.name.keyword", "size": 10}
        }
      }
    }
  }
}
We expect buckets for the five dads only, but the moms' names are returned as well:

{
  "took": 1,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {"total": 5, "max_score": 0, "hits": []},
  "aggregations": {
    "call": {
      "doc_count": 5,
      "name": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {"key": "张三妈", "doc_count": 1},
          {"key": "张三爸", "doc_count": 1},
          {"key": "我老妈", "doc_count": 1},
          {"key": "我老爸", "doc_count": 1},
          {"key": "李四妈", "doc_count": 1},
          {"key": "李四爸", "doc_count": 1},
          {"key": "王五妈", "doc_count": 1},
          {"key": "王五爸", "doc_count": 1},
          {"key": "赵六妈", "doc_count": 1},
          {"key": "赵六爸", "doc_count": 1}
        ]
      }
    }
  }
}
So without special handling, queries and aggregations on an array of objects give results that do not match our expectations.
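The reason the moms show up can be modelled in Python: with the object type, the filter selects whole documents, and the terms aggregation then counts every name value of each selected document. The sketch below is a simplified model under that assumption, not Elasticsearch code; `flat_filter_terms` and the `families` table are hypothetical names that mirror the five test documents.

```python
from collections import Counter

# (dad name, mom name) per owner, mirroring the five test documents.
families = {
    "张三": ("张三爸", "张三妈"),
    "李四": ("李四爸", "李四妈"),
    "王五": ("王五爸", "王五妈"),
    "赵六": ("赵六爸", "赵六妈"),
    "我": ("我老爸", "我老妈"),
}
docs = [
    {"owner": owner, "family": [{"call": "dad", "name": dad},
                                {"call": "mom", "name": mom}]}
    for owner, (dad, mom) in families.items()
]


def flat_filter_terms(docs, filter_field, filter_value, terms_field):
    """Model filter + terms on flattened object arrays (document-level matching)."""
    matched = 0
    buckets = Counter()
    for doc in docs:
        members = doc["family"]
        # The filter only asks: does ANY member of this document have the value?
        if any(m[filter_field] == filter_value for m in members):
            matched += 1
            # Every distinct value of the terms field in the document is counted.
            for value in {m[terms_field] for m in members}:
                buckets[value] += 1
    return matched, buckets


matched, buckets = flat_filter_terms(docs, "call", "dad", "name")
print(matched, len(buckets))
```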


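Next, the same operations using nested:

Step 1: do not leave the field type to dynamic mapping; the array of objects must be mapped as the nested type.

One way is to create a dynamic mapping template.

Before creating the template, delete the test data from the previous example so that we start from scratch.

Delete the index: DELETE my_index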

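PUT my_index
{
  "mappings": {
    "my_type": {
      "dynamic_templates": [
        {
          "object_as_nest": {
            "match_mapping_type": "object",
            "mapping": {"type": "nested"}
          }
        }
      ]
    }
  }
}
With this template, fields detected as objects are mapped as nested, while all other fields still follow the default dynamic mapping.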
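Note that the template applies to every field detected as an object, so any other object field added to this index would also become nested. As a narrower alternative (not the approach used in this article), an explicit mapping could declare only family as nested. The Python sketch below builds such a request body, assuming the same typed layout (my_index / my_type) used throughout; the helper name `explicit_nested_mapping` is hypothetical.

```python
import json


def explicit_nested_mapping(doc_type, nested_fields):
    """Build a create-index body that maps only the given fields as nested."""
    properties = {field: {"type": "nested"} for field in nested_fields}
    return {"mappings": {doc_type: {"properties": properties}}}


body = explicit_nested_mapping("my_type", ["family"])
# The printed JSON could be sent as the body of: PUT my_index
print(json.dumps(body, indent=2))
```

Sub-fields such as call and name would still be added by dynamic mapping when the first document is indexed.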

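Step 2: add data (in bulk or one document at a time)

The requests are omitted here.

Reuse the insert requests from Step 1 of the previous example to add the data.

This again gives five documents.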
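For the bulk option, the _bulk endpoint expects newline-delimited JSON: an action line followed by the document source, with a newline after the last line. The Python sketch below only builds such a body; it assumes the typed index layout used in this article (the `_type` key is tied to older ES versions with mapping types), and the helper name `build_bulk_body` is hypothetical. Sending the request is left out.

```python
import json


def build_bulk_body(index, doc_type, docs_by_id):
    """Build an NDJSON body for the _bulk endpoint from {id: source} pairs."""
    lines = []
    for doc_id, source in docs_by_id.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"


docs_by_id = {
    "1": {"owner": "张三", "family": [{"call": "dad", "name": "张三爸"},
                                     {"call": "mom", "name": "张三妈"}]},
    "2": {"owner": "李四", "family": [{"call": "dad", "name": "李四爸"},
                                     {"call": "mom", "name": "李四妈"}]},
}
bulk_body = build_bulk_body("my_index", "my_type", docs_by_id)
print(bulk_body)
```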

Step 3: check the resulting mapping:

{
  "my_index": {
    "mappings": {
      "my_type": {
        "dynamic_templates": [
          {
            "object_as_nest": {
              "match_mapping_type": "object",
              "mapping": {"type": "nested"}
            }
          }
        ],
        "properties": {
          "family": {
            "type": "nested",
            "properties": {
              "call": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}
              },
              "name": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}
              }
            }
          },
          "owner": {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}
          }
        }
      }
    }
  }
}

The type of family is now nested.

Step 4: query for a dad named 王五妈; we expect no hits

nested + must + term

GET my_index/my_type/_search
{
  "query": {
    "nested": {
      "path": "family",
      "score_mode": "sum",
      "query": {
        "bool": {
          "must": [
            {"term": {"family.call.keyword": "dad"}},
            {"term": {"family.name.keyword": "王五妈"}}
          ]
        }
      }
    }
  }
}
Response:

{
  "took": 1,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {"total": 0, "max_score": null, "hits": []}
}

Now query for a dad named 张三爸:

GET my_index/my_type/_search
{
  "query": {
    "nested": {
      "path": "family",
      "score_mode": "sum",
      "query": {
        "bool": {
          "must": [
            {"term": {"family.call.keyword": "dad"}},
            {"term": {"family.name.keyword": "张三爸"}}
          ]
        }
      }
    }
  }
}
The response contains exactly one hit, as expected:

{
  "took": 1,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 1,
    "max_score": 1.3862944,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "1",
        "_score": 1.3862944,
        "_source": {
          "owner": "张三",
          "family": [
            {"call": "dad", "name": "张三爸"},
            {"call": "mom", "name": "张三妈"}
          ]
        }
      }
    ]
  }
}
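Since the two nested queries differ only in the name being searched, the request body is easy to generate. The following Python sketch builds the same structure as the queries above; the function name `nested_term_query` is hypothetical, and it assumes each field has a `.keyword` sub-field, as produced by the default dynamic mapping shown in the mapping output.

```python
import json


def nested_term_query(path, terms, score_mode="sum"):
    """Build a nested query whose term clauses must all match one nested object."""
    must = [
        {"term": {f"{path}.{field}.keyword": value}}
        for field, value in terms.items()
    ]
    return {
        "query": {
            "nested": {
                "path": path,
                "score_mode": score_mode,
                "query": {"bool": {"must": must}},
            }
        }
    }


body = nested_term_query("family", {"call": "dad", "name": "张三爸"})
# The printed JSON could be sent as the body of: GET my_index/my_type/_search
print(json.dumps(body, ensure_ascii=False, indent=2))
```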

Step 5: aggregate the names of all dads

nested + filter + term + terms aggregation

GET my_index/my_type/_search?size=0
{
  "aggs": {
    "家庭": {
      "nested": {"path": "family"},
      "aggs": {
        "爸爸": {
          "filter": {"term": {"family.call.keyword": "dad"}},
          "aggs": {
            "爸爸集合": {
              "terms": {"field": "family.name.keyword", "size": 10}
            }
          }
        }
      }
    }
  }
}
The aggregation result is:

{
  "took": 1,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {"total": 5, "max_score": 0, "hits": []},
  "aggregations": {
    "家庭": {
      "doc_count": 10,
      "爸爸": {
        "doc_count": 5,
        "爸爸集合": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {"key": "张三爸", "doc_count": 1},
            {"key": "我老爸", "doc_count": 1},
            {"key": "李四爸", "doc_count": 1},
            {"key": "王五爸", "doc_count": 1},
            {"key": "赵六爸", "doc_count": 1}
          ]
        }
      }
    }
  }
}

Aggregate the names of all moms:

GET my_index/my_type/_search?size=0
{
  "aggs": {
    "家庭": {
      "nested": {"path": "family"},
      "aggs": {
        "妈妈": {
          "filter": {"term": {"family.call.keyword": "mom"}},
          "aggs": {
            "妈妈集合": {
              "terms": {"field": "family.name.keyword", "size": 10}
            }
          }
        }
      }
    }
  }
}

The aggregation result for the moms' names is:

{
  "took": 6,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {"total": 5, "max_score": 0, "hits": []},
  "aggregations": {
    "家庭": {
      "doc_count": 10,
      "妈妈": {
        "doc_count": 5,
        "妈妈集合": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {"key": "张三妈", "doc_count": 1},
            {"key": "我老妈", "doc_count": 1},
            {"key": "李四妈", "doc_count": 1},
            {"key": "王五妈", "doc_count": 1},
            {"key": "赵六妈", "doc_count": 1}
          ]
        }
      }
    }
  }
}
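The doc_count values in these responses (10 under the nested aggregation, 5 after the filter) can be reproduced with a small Python model: the nested step works on the individual family objects, the filter keeps the objects with the requested call, and the terms step counts names among those. This is a sketch of the semantics only; `nested_filter_terms` and the `families` table are hypothetical names mirroring the five test documents.

```python
from collections import Counter

# (dad name, mom name) per owner, mirroring the five test documents.
families = {
    "张三": ("张三爸", "张三妈"),
    "李四": ("李四爸", "李四妈"),
    "王五": ("王五爸", "王五妈"),
    "赵六": ("赵六爸", "赵六妈"),
    "我": ("我老爸", "我老妈"),
}
docs = [
    {"owner": owner, "family": [{"call": "dad", "name": dad},
                                {"call": "mom", "name": mom}]}
    for owner, (dad, mom) in families.items()
]


def nested_filter_terms(docs, path, filter_field, filter_value, terms_field):
    """Model nested -> filter -> terms: each nested object is handled on its own."""
    nested_docs = [obj for doc in docs for obj in doc.get(path, [])]
    filtered = [obj for obj in nested_docs if obj.get(filter_field) == filter_value]
    buckets = Counter(obj[terms_field] for obj in filtered)
    return len(nested_docs), len(filtered), buckets


nested_count, dad_count, dad_buckets = nested_filter_terms(
    docs, "family", "call", "dad", "name"
)
print(nested_count, dad_count, sorted(dad_buckets))
```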

The results above match our expectations exactly.

The approach is also fairly simple, especially setting up the dynamic mapping template.

http://www.chinasem.cn/article/463825
