This article is an overall summary of Elasticsearch 8, intended as a reference for developers working with it.
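Elasticsearch overview
Introduction to Elasticsearch
Official site: https://www.elastic.co/
Elasticsearch is a search server built on Lucene (the Apache open-source full-text search library). It provides a distributed, multi-tenant full-text search engine exposed through a RESTful web interface. Elasticsearch is written in Java. It was originally released as open source under the Apache 2.0 license; since version 7.11 the default distribution has been published under the Elastic License and SSPL instead. It is one of the most widely used enterprise search engines today.
Elastic has announced Elasticsearch version 8, which it describes as opening a new era in speed, scale, relevance, and simplicity.
Note: Elasticsearch 8 requires at least JDK 17 to run (the official distribution and Docker image bundle a suitable JDK). This article uses Elasticsearch 8.5.0.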
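Elasticsearch features
Near real-time
By default a document becomes searchable about 1 second after it is written to Elasticsearch, which gives near-real-time indexing and querying.
Distributed and scalable
Elasticsearch is distributed by design. Sharding is transparent to the application layer, nodes can be added easily, and clusters can grow to hundreds or even thousands of server nodes holding PB-scale data.
Stable and reliable
Distribution and data redundancy make operation more reliable, and this reliability has been validated by many projects at large internet companies.
Highly available
Data is stored as multiple replicas on multiple nodes, so the failure of a single node does not make the cluster unavailable.
REST API
Elasticsearch exposes a standard REST API, so any language that can issue HTTP requests can use it easily. Since version 8, the Java Transport Client has been removed and the High Level REST Client is deprecated; the recommended Java API Client is built on the REST API (on top of the low-level REST client), which resolves much of the earlier confusion over multiple client APIs.
High performance
Elasticsearch is built on Lucene and has strong search capabilities. With suitable sharding and hardware, searches over very large (even PB-scale) data sets can still return within seconds.
Multiple clients
Official clients exist for Java, Python, Go, PHP, Ruby, and other languages, and JDBC and ODBC drivers are also available.
Security
Elasticsearch provides single sign-on (SSO), encrypted communication, role-based and attribute-based access control, and auditing, and it can integrate third-party security components. Since version 8, security (including HTTPS) is enabled by default, which greatly simplifies security configuration.
Built-in NLP support
Elasticsearch supports NLP tasks such as sentiment analysis and text classification. Before version 8 these required external components; in version 8 they can be run inside Elasticsearch itself.
Native vector search
Elastic 8.0 introduced a set of native vector search features, including native support for approximate nearest neighbor (ANN) search, which compares vector-based queries against a vector-based document corpus quickly and at scale.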
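Elasticsearch use cases
For installation details, see the installation section later in this article or the software installation guide.
Building a logging system
Logging with the ELK stack is probably one of the most widespread uses of Elasticsearch. Elasticsearch stores and queries massive amounts of data, which suits log search particularly well. The widely used ELK stack (Elasticsearch, Logstash, Kibana) is the classic example: Logstash and Beats collect logs, Elasticsearch stores and queries application logs, and Kibana provides a visual search interface for them.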
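The Elasticsearch inverted index
Steps for building an inverted index:
- Split the data into terms (tokenization), recording which documents each term appears in
- Merge the entries that share the same term
- Sort the terms
Search process:
The search text is tokenized first; each resulting term is then looked up in the inverted index to obtain the matching document positions (docId). Finally the document data is fetched by docId.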
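The build and search steps above can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions: the class name `InvertedIndexSketch` and its whitespace tokenizer are hypothetical stand-ins (a real analyzer such as IK is far more elaborate), and Lucene's actual on-disk structures are not modelled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative sketch only, not how Lucene stores its index.
class InvertedIndexSketch {
    // term -> sorted ids of the documents containing the term.
    // A TreeMap keeps the terms sorted, mirroring the "sort the terms" step.
    private final Map<String, SortedSet<Integer>> postings = new TreeMap<>();

    // Hypothetical tokenizer: lowercases and splits on whitespace.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Tokenize the document and merge its id into the posting list of each term.
    void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Tokenize the query, look up each term, and union the doc ids (OR semantics).
    SortedSet<Integer> search(String query) {
        SortedSet<Integer> result = new TreeSet<>();
        for (String term : tokenize(query)) {
            result.addAll(postings.getOrDefault(term, Collections.emptySortedSet()));
        }
        return result;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "huawei laptop");
        index.add(2, "huawei phone");
        index.add(3, "vivo phone");
        System.out.println(index.search("laptop"));       // [1]
        System.out.println(index.search("huawei phone")); // [1, 2, 3]
    }
}
```

The last step of a real search, fetching the stored documents by docId, is left out here because it is an ordinary key lookup.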
Installing Elasticsearch
Step 1: pull the image
docker pull elasticsearch:8.5.0
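Step 2: start the container
Create three directories on the host (config, plugins, data). Note that the first command below deletes any existing /opt/elasticsearch directory: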
rm -rf /opt/elasticsearch
mkdir -p /opt/elasticsearch/{config,plugins,data}
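To configure a Chinese analyzer, download the IK plugin from GitHub, copy it into the plugins directory, unzip it there, and restart Elasticsearch: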
yum install -y unzip
unzip elasticsearch-analysis-ik-8.5.0.zip -d ik-analyzer
rm -rf elasticsearch-analysis-ik-8.5.0.zip
Create the configuration file
cat <<EOF> /opt/elasticsearch/config/elasticsearch.yml
xpack.security.enabled: false
xpack.license.self_generated.type: basic
xpack.security.transport.ssl.enabled: false # startup fails if this is not set
xpack.security.enrollment.enabled: true
http.host: 0.0.0.0
EOF
Grant permissions
chmod -R 777 /opt/elasticsearch
Run the container
docker run --name elasticsearch -p 9200:9200 -p 9300:9300 \
--net elastic \
--restart=always \
-e "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms1024m -Xmx1024m" \
-v /opt/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /opt/elasticsearch/data:/usr/share/elasticsearch/data \
-v /opt/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:8.5.0
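If the run fails because the network elastic cannot be found, create it with docker network create elastic and run the command again.
# Reset the two passwords below. Note: wait until Elasticsearch has finished starting.
docker exec -it elasticsearch bin/elasticsearch-reset-password -u elastic -i # -i prompts for a password of your choice; this account is used by the Java client
docker exec -it elasticsearch bin/elasticsearch-reset-password -u kibana_system -i # this account is used by Kibana
The examples later in this article use the username elastic with the password 111111. These credentials are only enforced when xpack.security.enabled is true; with the configuration file above (security disabled), requests are accepted without authentication.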
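Step 3: install the Chinese analyzer
- Download elasticsearch-analysis-ik-8.5.0.zip
- Upload it to the host plugins directory (/opt/elasticsearch/plugins, as created in step 2) and unzip it: unzip elasticsearch-analysis-ik-8.5.0.zip -d ik-analyzer
- The original archive elasticsearch-analysis-ik-8.5.0.zip must be deleted afterwards
- Restart Elasticsearch: docker restart a24eb9941759
a24eb9941759 is the container ID; replace it with your own container ID when running the command.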
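Elasticsearch core concepts
Elasticsearch compared with a database
MySQL: schema (DDL, i.e. CREATE TABLE statements)
ES: mapping
Index
An index is a collection of documents that share somewhat similar characteristics. For example, you might have one index for customer data, another for a product catalog, and another for order data. An index is identified by a name (which must be all lowercase), and that name is used whenever documents in the index are indexed, searched, updated, or deleted. A cluster can define as many indexes as needed.
Data must be indexed before it can be searched, and the benefit is faster queries. The table of contents at the front of the Xinhua Dictionary is an index in this sense: it speeds up lookups.
The essence of Elasticsearch indexing: everything is designed to improve search performance.
Type
In early versions, an index could define one or more types.
A type was a logical category/partition of an index whose semantics were entirely up to the user. Typically a type was defined for documents sharing a common set of fields. Type support changed from version to version:
Version | Type |
---|---|
5.x | Multiple types per index |
6.x | Only one type per index |
7.x | Custom types deprecated (the default type is _doc) |
8.x | Types removed; _doc remains only as a fixed segment of the document API path |
Document
A document is the basic unit of information that can be indexed, that is, one record.
For example, you can have a document for a particular customer, a document for a particular product, and a document for a particular order. Documents are expressed in JSON (JavaScript Object Notation), a ubiquitous data interchange format on the internet.
An index can store as many documents as you like.
Field
A field corresponds to a column of a database table; it labels the document data according to its different attributes.
Mapping
A mapping defines how data is handled and which rules apply to it, for example a field's data type, default value, analyzer, and whether the field is indexed. Other rules for handling data inside ES are also part of the mapping. Processing data according to well-chosen rules improves performance considerably, which is why a mapping is worth defining and why it pays to think about how to design it.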
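Elasticsearch basic features
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/8.5/elasticsearch-intro.html
Analyzers
The officially provided tokenizers include Standard, Letter, Lowercase, Whitespace, UAX URL Email, Classic, Thai, and others (some of them, such as Standard and Whitespace, also exist as analyzers). For Chinese, a third-party analyzer such as IK can be used; it was installed in the section above.
Index operations
An index in ES can be compared to a table in MySQL: creating an index is similar to creating a table.
There is no need to memorize the RESTful APIs. It is enough to know what each endpoint does and to be able to look it up in the official documentation: https://www.elastic.co/guide/index.html
- RESTful documentation: https://www.elastic.co/guide/en/elastic-stack/8.5/index.html
Creating an index
Syntax: PUT /{index name}
PUT /my_index
Result:
{"acknowledged" : true,"shards_acknowledged" : true,"index" : "my_index"
}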
Listing all indexes
GET /_cat/indices?v
Viewing a single index
Syntax: GET /{index name}
Deleting an index
Syntax: DELETE /{index name}
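Document operations
A document is the smallest unit of data that ES searches. It does not depend on a predefined schema, so a document can be compared to one table row expressed as JSON. In a relational database the columns must be defined before they can be used; in Elasticsearch fields are very flexible: a field can be omitted, or a new field can be added dynamically.
Creating a document
Syntax:
PUT /{index name}/{type}/{id}
{JSON body}
(in 8.x the {type} segment is always _doc)
For example:
PUT /my_index/_doc/1
{"title": "小米手机","category": "小米","images": "http://www.gulixueyuan.com/xm.jpg","price": 3999
}
When a document is created with an explicit unique ID, either POST or PUT can be used. To let Elasticsearch generate the ID, use POST /{index name}/_doc without an ID.
Viewing a document
Syntax: GET /{index name}/{type}/{id}
GET /my_index/_doc/1
Querying all documents
Syntax: GET /{index name}/_search
Updating a document (a PUT to an existing ID replaces the whole document)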
PUT /my_index/_doc/1
{"title": "小米手机","category": "小米","images": "http://www.gulixueyuan.com/xm.jpg","price": 4500
}
Updating individual fields
Note: partial updates must use POST.
Syntax: POST /{index name}/_update/{docId}
{"doc": {"field": "value"}
}
Example:
POST /my_index/_update/1
{"doc": {"price": 4500}
}
Deleting a document
Syntax: DELETE /{index name}/{type}/{id}
DELETE /my_index/_doc/1
Result:
{"_index": "my_index","_id": "1","_version": 5,"result": "deleted","_shards": {"total": 2,"successful": 1,"failed": 0},"_seq_no": 6,"_primary_term": 1
}
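Mapping
Creating a database table requires field names, types, lengths, constraints, and so on. An index is similar: Elasticsearch needs to know which fields it contains and which constraints apply to each of them. This is called the mapping.
Viewing the mapping
Syntax: GET /{index name}/_mapping
Dynamic mapping
In a relational database you first create the database, then create tables in it together with their columns, types, lengths, and primary keys, and only then insert data. In Elasticsearch the mapping (the counterpart of tables and columns) does not have to be defined in advance: when a document is written, Elasticsearch detects the type of each field automatically. This mechanism is called dynamic mapping.
Dynamic mapping rules:
JSON value | Mapped type |
---|---|
null | no field is added |
true / false | boolean |
string | text with a keyword sub-field |
integer | long |
floating-point number | float |
date string | date |
Special case: strings
- text: for long text. The content is analyzed (producing inverted-index posting lists) and supports full-text queries with multiple keywords. Example: product names in an e-commerce project, or article bodies. Drawback: by default it cannot be used for aggregation (grouping) or sorting.
- keyword: for exact term lookups. It supports equality queries on strings that should not be analyzed (because tokenizing them would be meaningless). Examples: user nicknames, phone numbers, ID card numbers, image URLs. Typical scenario: exact match on a brand name. It supports aggregation (grouping) and sorting, but not multi-keyword full-text queries.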
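To make the dynamic mapping rules above concrete, here is a rough Java sketch of the type-detection decision. It is an approximation, not Elasticsearch's implementation: the class `DynamicMappingSketch` and its methods are hypothetical names, date detection is reduced to a simple ISO-8601 check (the real default formats are broader), and arrays are not covered.

```java
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.time.format.DateTimeParseException;
import java.util.Map;

// Hypothetical illustration of the dynamic mapping table; not the real ES logic.
class DynamicMappingSketch {

    // Returns the field type dynamic mapping would roughly choose,
    // or null when no field would be added at all.
    static String inferType(Object value) {
        if (value == null) {
            return null; // null values do not create a field
        }
        if (value instanceof Boolean) {
            return "boolean";
        }
        if (value instanceof Integer || value instanceof Long) {
            return "long";
        }
        if (value instanceof Float || value instanceof Double) {
            return "float";
        }
        if (value instanceof String s) {
            // Assumption: only plain ISO dates/date-times are treated as dates here.
            return looksLikeIsoDate(s) ? "date" : "text+keyword";
        }
        if (value instanceof Map) {
            return "object";
        }
        throw new IllegalArgumentException("value kind not covered by this sketch");
    }

    // Simplified stand-in for date detection.
    static boolean looksLikeIsoDate(String s) {
        try {
            LocalDate.parse(s);
            return true;
        } catch (DateTimeParseException ignored) {
            // fall through and try a full date-time with offset
        }
        try {
            OffsetDateTime.parse(s);
            return true;
        } catch (DateTimeParseException ignored) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(inferType(3999));           // long
        System.out.println(inferType(39.9));           // float
        System.out.println(inferType("2023-10-21"));   // date
        System.out.println(inferType("xiaomi phone")); // text+keyword
    }
}
```

The value "text+keyword" stands for a text field with a keyword sub-field, which is why a dynamically mapped string can be searched as full text and still be aggregated through its keyword sub-field.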
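Static mapping
A mapping can also be defined in advance in Elasticsearch, by hand, including the type and analyzer of each document field. This is called static mapping.
# Delete the previously created index
DELETE /my_index
# Create the index and specify the mapping and analyzers at the same time.
# index: whether the field is indexed; store: whether the field is stored;
# analyzer and search_analyzer are only needed for fields of type text.
PUT /my_index
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "index": true,
        "store": true,
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart"
      },
      "category": {"type": "keyword", "index": true, "store": true},
      "images": {"type": "keyword", "index": false, "store": true},
      "price": {"type": "integer", "index": true, "store": true}
    }
  }
}
Result: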
{"acknowledged" : true,"shards_acknowledged" : true,"index" : "my_index"
}
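The field types are:
- String: text (analyzed) and keyword (not analyzed).
- text: used to index long text. Before indexing, the text is analyzed into a sequence of terms, which are then indexed so that ES can search for them. A text field cannot be used for sorting or aggregation.
- keyword: not analyzed. It can be used for filtering, sorting, and aggregation, but it cannot be searched with analyzed full-text matching the way a text field can.
- Numeric: long, integer, short, byte, double, float
- Date: date
- Boolean: boolean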
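Introduction to nested
nested is a specialised version of the object data type that allows arrays of objects to be indexed and queried independently of each other.
Demo: create an ordinary index
If my_comment_index already exists, delete it first: DELETE /my_comment_index
Step 1: create an index (storing a blog post and all of its comments)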
PUT my_comment_index/_doc/1
{"title": "狂人日记","body": "《狂人日记》是一篇象征性和寓意很强的小说,当时,鲁迅对中国国民精神的麻木愚昧颇感痛切。","comments": [{"name": "张三","age": 34,"rating": 8,"comment": "非常棒的文章","commented_on": "30 Nov 2023"},{"name": "李四","age": 38,"rating": 9,"comment": "文章非常好","commented_on": "25 Nov 2022"},{"name": "王五","age": 33,"rating": 7,"comment": "手动点赞","commented_on": "20 Nov 2021"}]
}
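As shown above, a single document describes one post together with an inner array of objects, comments, holding all comments on the post.
However, inner objects do not behave in Elasticsearch searches the way one might expect.
Step 2: run a query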
GET /my_comment_index/_search
{"query": {"bool": {"must": [{"match": {"comments.name": "李四"}},{"match": {"comments.age": 34}}]}}
}
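Result: surprisingly, the document is returned, even though no single comment has both the name 李四 and the age 34.
Cause: the default data type of the comments field is object, so the document is stored internally as:
{
  "title": [ 狂人日记 ],
  "body": [ 《狂人日记》是一篇象征性和寓意很强的小说,当时… ],
  "comments.name": [ 张三, 李四, 王五 ],
  "comments.comment": [ 非常棒的文章, 文章非常好, 手动点赞 ],
  "comments.age": [ 33, 34, 38 ],
  "comments.rating": [ 7, 8, 9 ]
}
We can clearly see that the association between comments.name and comments.age has been lost. That is why the document matches a query for 李四 and 34.
Put simply, each comment should be an indivisible unit; its fields must not be matched separately and then pieced together.
Step 3: delete the current index
DELETE /my_comment_index
Step 4: create the index with the comments field mapped as nested (instead of the default object type)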
PUT my_comment_index
{"mappings": {"properties": {"comments": {"type": "nested" }}}
}
PUT my_comment_index/_doc/1
{"title": "狂人日记","body": "《狂人日记》是一篇象征性和寓意很强的小说,当时,鲁迅对中国国民精神的麻木愚昧颇感痛切。","comments": [{"name": "张三","age": 34,"rating": 8,"comment": "非常棒的文章","commented_on": "30 Nov 2023"},{"name": "李四","age": 38,"rating": 9,"comment": "文章非常好","commented_on": "25 Nov 2022"},{"name": "王五","age": 33,"rating": 7,"comment": "手动点赞","commented_on": "20 Nov 2021"}]
}
With the document from step 1 indexed again, run the search as a nested query:
GET /my_comment_index/_search
{"query": {"nested": {"path": "comments","query": {"bool": {"must": [{"match": {"comments.name": "李四"}},{"match": {"comments.age": 34}}]}}}}
}
No documents are returned. Why?
When a field is mapped as nested, each object in the array is indexed as a separate hidden document, which means each nested object can be queried independently of the others. The internal representation of the document is:
{
  {
    "comments.name": [ 张三 ],
    "comments.comment": [ 非常棒的文章 ],
    "comments.age": [ 34 ],
    "comments.rating": [ 8 ]
  },
  {
    "comments.name": [ 李四 ],
    "comments.comment": [ 文章非常好 ],
    "comments.age": [ 38 ],
    "comments.rating": [ 9 ]
  },
  {
    "comments.name": [ 王五 ],
    "comments.comment": [ 手动点赞 ],
    "comments.age": [ 33 ],
    "comments.rating": [ 7 ]
  },
  {
    "title": [ 狂人日记 ],
    "body": [ 《狂人日记》是一篇象征性和寓意很强的小说,当时,鲁迅对中国… ]
  }
}
Each inner object is stored internally as a separate hidden document, which preserves the relationship between its fields.
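The difference between the two representations can be imitated in plain Java. The sketch below is only an analogy under stated assumptions: `NestedVsObjectSketch` and its two methods are hypothetical names, the comments are reduced to name and age, and pinyin names stand in for the three commenters. It does not call Elasticsearch.

```java
import java.util.List;

// Analogy for object vs. nested matching; no Elasticsearch involved.
class NestedVsObjectSketch {
    record Comment(String name, int age) {}

    // Object-style (flattened) matching: every condition is checked against
    // the pooled values of all comments, so different comments may satisfy
    // different conditions.
    static boolean matchesFlattened(List<Comment> comments, String name, int age) {
        boolean nameFound = comments.stream().anyMatch(c -> c.name().equals(name));
        boolean ageFound = comments.stream().anyMatch(c -> c.age() == age);
        return nameFound && ageFound;
    }

    // Nested-style matching: all conditions must hold on the same comment.
    static boolean matchesNested(List<Comment> comments, String name, int age) {
        return comments.stream().anyMatch(c -> c.name().equals(name) && c.age() == age);
    }

    public static void main(String[] args) {
        List<Comment> comments = List.of(
                new Comment("Zhang San", 34),
                new Comment("Li Si", 38),
                new Comment("Wang Wu", 33));
        System.out.println(matchesFlattened(comments, "Li Si", 34)); // true
        System.out.println(matchesNested(comments, "Li Si", 34));    // false
        System.out.println(matchesNested(comments, "Li Si", 38));    // true
    }
}
```

The flattened variant reproduces the unexpected hit from step 2, while the per-comment variant corresponds to the nested query returning nothing for the name 李四 combined with age 34.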
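Advanced queries with the DSL
DSL overview
Query DSL overview: DSL stands for Domain Specific Language. Elasticsearch provides a JSON-based DSL for defining queries.
Create the index with its mapping (if my_index still exists from the earlier examples, delete it first):
# Create the index and specify the mapping and analyzers at the same time.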
PUT /my_index
{"mappings": {"properties": {"title": {"type": "text","index": true,"store": true,"analyzer": "ik_max_word","search_analyzer": "ik_smart"},"category": { "type": "keyword","index": true,"store": true},"images": {"type": "keyword","index": false,"store": true},"price": {"type": "integer","index": true,"store": true}}}
}
Prepare the data:
PUT /my_index/_doc/1
{"id":1,"title":"华为笔记本电脑","category":"华为","images":"http://www.gulixueyuan.com/xm.jpg","price":5388}
PUT /my_index/_doc/2
{"id":2,"title":"华为手机","category":"华为","images":"http://www.gulixueyuan.com/xm.jpg","price":5500}
PUT /my_index/_doc/3
{"id":3,"title":"VIVO手机","category":"vivo","images":"http://www.gulixueyuan.com/xm.jpg","price":3600}
DSL queries
Querying all documents
match_all:
POST /my_index/_search
{"query": {"match_all": {}}
}
Result:
{"took" : 0,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 3,"relation" : "eq"},"max_score" : 1.0,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 1.0,"_source" : {"id" : 1,"title" : "华为笔记本电脑","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5388}},{"_index" : "my_index","_type" : "_doc","_id" : "2","_score" : 1.0,"_source" : {"id" : 2,"title" : "华为手机","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5500}},{"_index" : "my_index","_type" : "_doc","_id" : "3","_score" : 1.0,"_source" : {"id" : 3,"title" : "VIVO手机","category" : "vivo","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 3600}}]}
}
Match query (match)
match: the query text is analyzed before matching.
POST /my_index/_search
{"query": {"match": {"title": "华为智能手机"}}
}
Result:
{"took" : 3,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 2,"relation" : "eq"},"max_score" : 0.5619608,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "2","_score" : 0.5619608,"_source" : {"id" : 2,"title" : "华为手机","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5500}},{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 0.35411233,"_source" : {"id" : 1,"title" : "华为笔记本电脑","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5388}}]}
}
Multi-field match
POST /my_index/_search
{"query": {"multi_match": {"query": "华为智能手机","fields": ["title","category"]}}
}
Result:
{"took" : 3,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 2,"relation" : "eq"},"max_score" : 0.5619608,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "2","_score" : 0.5619608,"_source" : {"id" : 2,"title" : "华为手机","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5500}},{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 0.35411233,"_source" : {"id" : 1,"title" : "华为笔记本电脑","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5388}}]}
}
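Exact term query
term: the query keyword is not analyzed.
POST /my_index/_search
{"query": {"term": {"title": {"value": "华为手机"}}}
}
Result (no hits: title is a text field, so the index holds analyzed tokens such as 华为 and 手机, and none of them equals the un-analyzed term 华为手机):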
{"took" : 0,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 0,"relation" : "eq"},"max_score" : null,"hits" : [ ]}
}
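Exact query with multiple terms
POST /my_index/_search
{"query": {"terms": {"title": ["华为手机","华为"]}}
}
Result (both hits come from the term 华为, which exists as a token in the index):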
{"took" : 0,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 2,"relation" : "eq"},"max_score" : 1.0,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 1.0,"_source" : {"id" : 1,"title" : "华为笔记本电脑","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5388}},{"_index" : "my_index","_type" : "_doc","_id" : "2","_score" : 1.0,"_source" : {"id" : 2,"title" : "华为手机","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5500}}]}
}
Range query
Range queries use range.
- gte: greater than or equal to
- lte: less than or equal to
- gt: greater than
- lt: less than
POST /my_index/_search
{"query": {"range": {"price": {"gte": 3000,"lte": 5000}}}
}
Result:
{"took" : 0,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 1,"relation" : "eq"},"max_score" : 1.0,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "3","_score" : 1.0,"_source" : {"title" : "VIVO手机","category" : "vivo"}}]}
}
Selecting the returned fields
Add _source at the same level as query to filter the returned fields.
POST /my_index/_search
{"query": {"terms": {"title": ["华为手机","华为"]}},"_source": ["title","category"]
}
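Compound queries
bool combines conditions with AND, OR, or NOT relationships.
- must: every condition must be satisfied; the conditions are ANDed
- should: at least one condition must be satisfied; the conditions are ORed
- must_not: none of the conditions may be satisfied; the conditions are negated
- filter: same matching effect as must, but no score is computed, so it is more efficient
①must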
POST /my_index/_search
{"query": {"bool": {"must": [{"match": {"title": "华为"}},{"range": {"price": {"gte": 3000,"lte": 5400}}}]}}
}
Result:
{"took": 1,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 1,"relation": "eq"},"max_score": 1.2923405,"hits": [{"_index": "my_index","_id": "1","_score": 1.2923405,"_source": {"id": 1,"title": "华为笔记本电脑","category": "华为","images": "http://www.gulixueyuan.com/xm.jpg","price": 5388}}]}
}
②should
POST /my_index/_search
{"query": {"bool": {"should": [{"match": {"title": "华为"}},{"range": {"price": {"gte": 3000,"lte": 5000}}}]}}
}
Result:
{"took" : 0,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 3,"relation" : "eq"},"max_score" : 1.0,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "3","_score" : 1.0,"_source" : {"id" : 3,"title" : "VIVO手机","category" : "vivo","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 3600}},{"_index" : "my_index","_type" : "_doc","_id" : "2","_score" : 0.5619608,"_source" : {"id" : 2,"title" : "华为手机","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5500}},{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 0.35411233,"_source" : {"id" : 1,"title" : "华为笔记本电脑","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5388}}]}
}
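When should and must appear together, the must clauses decide whether a document matches; the should clauses become optional and only contribute to the score (unless minimum_should_match is set). The example below returns no hits because no document satisfies both must clauses (both 华为 products cost more than 5000):
POST /my_index/_search
{"query": {"bool": {"should": [{"match": {"title": "华为"}},{"range": {"price": {"gte": 3000,"lte": 5000}}}],"must": [{"match": {"title": "华为"}},{"range": {"price": {"gte": 3000,"lte": 5000}}}]}}
}
Result: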
{"took" : 1,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 0,"relation" : "eq"},"max_score" : null,"hits" : [ ]}
}
③must_not
POST /my_index/_search
{"query": {"bool": {"must_not": [{"match": {"title": "华为"}},{"range": {"price": {"gte": 3000,"lte": 5000}}}]}}
}
Result:
{"took" : 0,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 0,"relation" : "eq"},"max_score" : null,"hits" : [ ]}
}
④filter
With filter, the _score of every hit is 0.
POST /my_index/_search
{"query": {"bool": {"filter": [{"match": {"title": "华为"}}]}}
}
Result:
{"took" : 1,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 2,"relation" : "eq"},"max_score" : 0.0,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 0.0,"_source" : {"id" : 1,"title" : "华为笔记本电脑","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5388}},{"_index" : "my_index","_type" : "_doc","_id" : "2","_score" : 0.0,"_source" : {"id" : 2,"title" : "华为手机","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5500}}]}
}
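Fuzzy matching
fuzzy is a similarity query.
GET /my_index/_search
{"query":{"fuzzy":{"title":"viwo"}}
}
Fuzzy matching has an edit-distance limit: vivo misspelled as viwo (one edit) still finds vivo, but wiwo (two edits) finds nothing. With the default fuzziness (AUTO), terms of 3 to 5 characters allow only one edit.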
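The viwo/wiwo behaviour follows from edit distance. The Java sketch below computes a plain Levenshtein distance and applies the length thresholds of the AUTO fuzziness setting (0 to 2 characters: no edits, 3 to 5: one edit, longer: two edits). It is a simplification: the class and method names are hypothetical, and unlike Elasticsearch's default it does not count a transposition of two adjacent characters as a single edit.

```java
// Simplified model of fuzzy matching; names are illustrative only.
class FuzzySketch {

    // Plain Levenshtein distance (insert, delete, substitute), two-row DP.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // the finished row becomes the previous row
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Allowed number of edits for a term under fuzziness AUTO.
    static int autoFuzziness(String term) {
        int n = term.length();
        if (n <= 2) {
            return 0;
        }
        return n <= 5 ? 1 : 2;
    }

    static boolean fuzzyMatches(String queryTerm, String indexedTerm) {
        return editDistance(queryTerm, indexedTerm) <= autoFuzziness(queryTerm);
    }

    public static void main(String[] args) {
        System.out.println(editDistance("viwo", "vivo")); // 1
        System.out.println(editDistance("wiwo", "vivo")); // 2
        System.out.println(fuzzyMatches("viwo", "vivo")); // true
        System.out.println(fuzzyMatches("wiwo", "vivo")); // false
    }
}
```

In a real query the allowed distance can be set explicitly through the fuzziness parameter of the fuzzy query instead of relying on AUTO.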
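Aggregations
SQL analogy: select course_id,avg(score) from tab where … group by course_id
Aggregations let users run statistical analysis over ES documents, similar to GROUP BY in a relational database. Many other aggregations exist as well, such as maximum and average.
The three elements of an aggregation: the aggregation name (a name for each aggregation, used when parsing the result), the aggregation field (which field to group or compute on), and the aggregation type (how to aggregate; the most common is placing identical field values in one group).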
max
POST /my_index/_search
{"query": {"match_all": {}},"size": 0, "aggs": {"max_price": {"max": {"field": "price"}}}
}
Result:
{"took" : 1,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 3,"relation" : "eq"},"max_score" : null,"hits" : [ ]},"aggregations" : {"max_price" : {"value" : 5500.0}}
}
min
POST /my_index/_search
{"query": {"match_all": {}},"size": 0, "aggs": {"min_price": {"min": {"field": "price"}}}
}
Result:
{"took" : 12,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 3,"relation" : "eq"},"max_score" : null,"hits" : [ ]},"aggregations" : {"min_price" : {"value" : 3600.0}}
}
avg
POST /my_index/_search
{"query": {"match_all": {}},"size": 0, "aggs": {"avg_price": {"avg": {"field": "price"}}}
}
Result:
{"took" : 12,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 3,"relation" : "eq"},"max_score" : null,"hits" : [ ]},"aggregations" : {"avg_price" : {"value" : 4829.333333333333}}
}
sum
POST /my_index/_search
{"query": {"match_all": {}},"size": 0, "aggs": {"sum_price": {"sum": {"field": "price"}}}
}
Result:
{"took" : 3,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 3,"relation" : "eq"},"max_score" : null,"hits" : [ ]},"aggregations" : {"sum_price" : {"value" : 14488.0}}
}
stats
POST /my_index/_search
{"query": {"match_all": {}},"size": 0, "aggs": {"stats_price": {"stats": {"field": "price"}}}
}
Result:
{"took" : 20,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 3,"relation" : "eq"},"max_score" : null,"hits" : [ ]},"aggregations" : {"stats_price" : {"count" : 3,"min" : 3600.0,"max" : 5500.0,"avg" : 4829.333333333333,"sum" : 14488.0}}
}
terms
A bucket aggregation is comparable to the GROUP BY clause in SQL.
POST /my_index/_search
{"query": {"match_all": {}},"size": 0, "aggs": {"groupby_category": {"terms": {"field": "category","size": 10}}}
}
Result:
{"took" : 16,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 3,"relation" : "eq"},"max_score" : null,"hits" : [ ]},"aggregations" : {"groupby_category" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "华为","doc_count" : 2},{"key" : "vivo","doc_count" : 1}]}}
}
Buckets can be drilled into further with a sub-aggregation:
POST /my_index/_search
{"query": {"match_all": {}},"size": 0, "aggs": {"groupby_category": {"terms": {"field": "category","size": 10},"aggs": {"avg_price": {"avg": {"field": "price"}}}}}
}
Result:
{"took" : 2,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 3,"relation" : "eq"},"max_score" : null,"hits" : [ ]},"aggregations" : {"groupby_category" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "华为","doc_count" : 2,"avg_price" : {"value" : 5444.0}},{"key" : "vivo","doc_count" : 1,"avg_price" : {"value" : 3600.0}}]}}
}
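For readers more familiar with Java collections than with SQL, the terms bucket aggregation and its avg sub-aggregation behave much like `Collectors.groupingBy` with a downstream collector. The sketch below is an in-memory analogy only: `TermsAggSketch` and its `Product` record are hypothetical, the category values are written in ASCII, and nothing here talks to Elasticsearch.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// In-memory analogy of a terms aggregation with an avg sub-aggregation.
class TermsAggSketch {
    record Product(String title, String category, int price) {}

    // Like the terms aggregation: one bucket per category with its doc_count.
    static Map<String, Long> countByCategory(List<Product> products) {
        return products.stream().collect(
                Collectors.groupingBy(Product::category, TreeMap::new, Collectors.counting()));
    }

    // Like the avg sub-aggregation: the average price inside each bucket.
    static Map<String, Double> avgPriceByCategory(List<Product> products) {
        return products.stream().collect(
                Collectors.groupingBy(Product::category, TreeMap::new,
                        Collectors.averagingInt(Product::price)));
    }

    public static void main(String[] args) {
        List<Product> products = List.of(
                new Product("huawei laptop", "huawei", 5388),
                new Product("huawei phone", "huawei", 5500),
                new Product("vivo phone", "vivo", 3600));
        System.out.println(countByCategory(products));    // {huawei=2, vivo=1}
        System.out.println(avgPriceByCategory(products)); // {huawei=5444.0, vivo=3600.0}
    }
}
```

The averages agree with the avg_price values in the result above (5444.0 and 3600.0), which is a quick way to sanity-check an aggregation on a small data set.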
Sorting
POST /my_index/_search
{"query": {"bool": {"must": [{"match": {"title": "华为"}}]}},"sort": [{"price": {"order": "asc"}},{"_score": {"order": "desc"}}]
}
Result:
{"took" : 0,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 2,"relation" : "eq"},"max_score" : null,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 0.35411233,"_source" : {"id" : 1,"title" : "华为笔记本电脑","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5388},"sort" : [5388,0.35411233]},{"_index" : "my_index","_type" : "_doc","_id" : "2","_score" : 0.5619608,"_source" : {"id" : 2,"title" : "华为手机","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5500},"sort" : [5500,0.5619608]}]}
}
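Pagination
Pagination is controlled by two properties: from and size.
- from: offset of the first hit on the current page, starting at 0. from = (pageNum - 1) * size
- size: number of hits per page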
POST /my_index/_search
{"query": {"match_all": {}},"from": 0,"size": 2
}
Result:
{"took" : 3,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 3,"relation" : "eq"},"max_score" : 1.0,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 1.0,"_source" : {"id" : 1,"title" : "华为笔记本电脑","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5388}},{"_index" : "my_index","_type" : "_doc","_id" : "2","_score" : 1.0,"_source" : {"id" : 2,"title" : "华为手机","category" : "华为","images" : "http://www.gulixueyuan.com/xm.jpg","price" : 5500}}]}
}
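Note: at most the first 10000 records can be returned this way. By default from + size must not exceed 10000 (the index.max_result_window setting).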
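The from formula and the 10000-record limit can be wrapped in a small helper before a request is built. This is a hedged sketch: `PagingSketch`, `fromOffset`, and `withinWindow` are hypothetical names, and the constant assumes the default value of index.max_result_window.

```java
// Hypothetical paging helper; assumes the default index.max_result_window.
class PagingSketch {
    static final int MAX_RESULT_WINDOW = 10_000;

    // from = (pageNum - 1) * size, with pageNum starting at 1.
    static long fromOffset(int pageNum, int size) {
        if (pageNum < 1 || size < 1) {
            throw new IllegalArgumentException("pageNum and size must be >= 1");
        }
        return (long) (pageNum - 1) * size; // long arithmetic avoids int overflow
    }

    // True when the requested page still fits inside the result window.
    static boolean withinWindow(int pageNum, int size) {
        return fromOffset(pageNum, size) + size <= MAX_RESULT_WINDOW;
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(1, 2));       // 0
        System.out.println(fromOffset(3, 10));      // 20
        System.out.println(withinWindow(1000, 10)); // true  (9990 + 10 = 10000)
        System.out.println(withinWindow(1001, 10)); // false (10000 + 10 > 10000)
    }
}
```

For pages beyond the window, Elasticsearch offers the search_after parameter, which continues from the sort values of the last hit instead of using an offset.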
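Highlighting
The three elements of highlighting:
- the field to highlight
- the opening tag (an HTML tag)
- the closing tag (an HTML tag)
Searching products by keyword
# Highlighting requires a keyword entered by the user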
GET my_index/_search
{"query": {"match": {"title": "华为 手机"}},"highlight": {"fields": {"title": {}},"pre_tags": "<font style='color:red'>","post_tags": "</font>"}
}
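Result (this sample output also contains a document with _id 4, OPPO手机, which is not among the three documents prepared above):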
{"took": 96,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 4,"relation": "eq"},"max_score": 1.2155836,"hits": [{"_index": "my_index","_id": "2","_score": 1.2155836,"_source": {"id": 2,"title": "华为手机","category": "华为","images": "http://www.gulixueyuan.com/xm.jpg","price": 5500},"highlight": {"title": ["<font style='color:red'>华为</font><font style='color:red'>手机</font>"]}},{"_index": "my_index","_id": "1","_score": 0.49191093,"_source": {"id": 1,"title": "华为笔记本电脑","category": "华为","images": "http://www.gulixueyuan.com/xm.jpg","price": 5388},"highlight": {"title": ["<font style='color:red'>华为</font>笔记本电脑"]}},{"_index": "my_index","_id": "3","_score": 0.41299206,"_source": {"id": 3,"title": "VIVO手机","category": "vivo","images": "http://www.gulixueyuan.com/xm.jpg","price": 3600},"highlight": {"title": ["VIVO<font style='color:red'>手机</font>"]}},{"_index": "my_index","_id": "4","_score": 0.41299206,"_source": {"id": 3,"title": "OPPO手机","category": "oppo","images": "http://www.gulixueyuan.com/xm.jpg","price": 5500},"highlight": {"title": ["OPPO<font style='color:red'>手机</font>"]}}]}
}
Elasticsearch Java API Client
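Official documentation: https://www.elastic.co/guide/en/elasticsearch/client/java-api-client/8.5/installation.html
Note: the client for Elasticsearch 8 differs from the one for version 7, so a client bean has to be declared explicitly.
Setting up the project
1. Create the project: elasticsearch_demo
2. Add the following pom.xml: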
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.0.5</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.atguigu</groupId>
    <artifactId>elasticsearch_demo</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <properties>
        <java.version>17</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>co.elastic.clients</groupId>
            <artifactId>elasticsearch-java</artifactId>
            <version>8.5.3</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>2.12.3</version>
        </dependency>
        <dependency>
            <groupId>jakarta.json</groupId>
            <artifactId>jakarta.json-api</artifactId>
            <version>2.0.1</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>
Configuring the connection
Configure the ES connection in the application class:
package com.atguigu.elasticsearch_demo;

import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;
import co.elastic.clients.transport.ElasticsearchTransport;
import co.elastic.clients.transport.rest_client.RestClientTransport;
import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.elasticsearch.client.RestClient;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
public class ElasticsearchDemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(ElasticsearchDemoApplication.class, args);
    }

    @Bean
    public ElasticsearchClient buildElasticsearchClient() {
        // Basic-auth credentials for the elastic user
        BasicCredentialsProvider credsProv = new BasicCredentialsProvider();
        credsProv.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials("elastic", "111111"));

        // Low-level REST client pointing at the Elasticsearch node
        RestClient restClient = RestClient.builder(HttpHost.create("http://192.168.200.6:9200"))
                .setHttpClientConfigCallback(hc -> hc.setDefaultCredentialsProvider(credsProv))
                .build();

        // Create the transport with a Jackson mapper
        ElasticsearchTransport transport = new RestClientTransport(restClient, new JacksonJsonpMapper());

        // And create the API client
        return new ElasticsearchClient(transport);
    }
}
Testing a query
package com.atguigu.elasticsearch_demo;

import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.elasticsearch.core.SearchRequest;
import co.elastic.clients.elasticsearch.core.SearchResponse;
import co.elastic.clients.elasticsearch.core.search.Hit;
import com.atguigu.elasticsearch_demo.model.Goods;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

import java.io.IOException;
import java.util.List;

@SpringBootTest
class ElasticsearchDemoApplicationTest {

    @Autowired
    private ElasticsearchClient elasticsearchClient;

    /**
     * RESTful equivalent: GET my_index/_search
     */
    @Test
    public void testDoc() throws IOException {
        // 1. Build the search request with a builder
        SearchRequest.Builder builder = new SearchRequest.Builder();
        builder.index("my_index");
        // 2. Execute the search
        SearchRequest searchRequest = builder.build();
        System.out.println(searchRequest.toString());
        SearchResponse<Goods> response = elasticsearchClient.search(searchRequest, Goods.class);
        // 3. Parse the response
        List<Hit<Goods>> hits = response.hits().hits();
        for (Hit<Goods> hit : hits) {
            Goods goods = hit.source();
            System.out.println(goods);
        }
    }

    @Test
    public void testDocLambda() throws IOException {
        // Same search written with the lambda-style builder
        SearchResponse<Goods> response = elasticsearchClient.search(s -> s.index("my_index"), Goods.class);
        // Parse the response
        List<Hit<Goods>> hits = response.hits().hits();
        for (Hit<Goods> hit : hits) {
            Goods goods = hit.source();
            System.out.println(goods);
        }
    }
}
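The test above imports a Goods class from the model package, but the article does not show it. A minimal sketch could look like the following; the field names and types are assumptions derived from the sample documents (id, title, category, images, price), and the no-argument constructor plus setters are there because the Jackson-based mapper populates the object through them.

```java
// Hypothetical model class; field names are taken from the sample documents.
// In the project it would be declared public and placed in
// com.atguigu.elasticsearch_demo.model (file Goods.java).
class Goods {
    private Long id;
    private String title;
    private String category;
    private String images;
    private Integer price;

    // No-arg constructor used by the JSON mapper
    public Goods() {
    }

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }

    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }

    public String getCategory() { return category; }
    public void setCategory(String category) { this.category = category; }

    public String getImages() { return images; }
    public void setImages(String images) { this.images = images; }

    public Integer getPrice() { return price; }
    public void setPrice(Integer price) { this.price = price; }

    @Override
    public String toString() {
        return "Goods{id=" + id + ", title=" + title + ", category=" + category
                + ", images=" + images + ", price=" + price + "}";
    }
}
```

With Lombok on the classpath the accessors could be generated by @Data instead, as the Product class in the Spring Data section does.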
Printed output:
{"took": 0,"timed_out": false,"_shards": {"failed": 0.0,"successful": 1.0,"total": 1.0,"skipped": 0.0},"hits": {"total": {"relation": "eq","value": 2},"hits": [{"_index": "my_index","_id": "2","_score": 0.49216813,"_source": "{id=2, title=华为手机, category=华为, images=http://www.gulixueyuan.com/xm.jpg, price=5500}"},{"_index": "my_index","_id": "1","_score": 0.29234046,"_source": "{id=1, title=华为笔记本电脑, category=华为, images=http://www.gulixueyuan.com/xm.jpg, price=5388}"}],"max_score": 0.49216813}
}
Spring Data Elasticsearch
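Official documentation: https://spring.io/projects/spring-data-elasticsearch
Spring Data is an open-source framework that simplifies access to relational databases, non-relational databases, and search indexes, and it also supports cloud services. Its main goal is to make data access quick and convenient. Spring Data greatly reduces the code needed for a data access layer (JPA, Elasticsearch, and so on): data can be accessed and manipulated with almost no hand-written implementation. Besides CRUD, it also covers common features such as paging and sorting.
Spring Data Elasticsearch builds on the Spring Data API to simplify Elasticsearch operations by wrapping the underlying Elasticsearch client API. It provides Spring Data integration for the Elasticsearch search engine. Its key feature is a POJO-centric model for interacting with Elasticsearch documents, which makes it easy to write a repository-style data access layer.
Setting up the project
1. Create the project: elasticsearch_demo_springdata_es
2. Add the following pom.xml: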
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.0.5</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.atguigu</groupId>
    <artifactId>elasticsearch_demo_springdata_es</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <properties>
        <java.version>17</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>
3. Add the configuration file
application.yml
spring:
  elasticsearch:
    uris: http://192.168.200.6:9200
    username: elastic
    password: 111111
Document mapping
package com.atguigu.springdata.model;

import lombok.Data;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

/**
 * @author: atguigu
 * @create: 2023-10-21 16:42
 */
@Data
@Document(indexName = "product")
public class Product {

    @Id
    @Field(type = FieldType.Long)
    private Long id;

    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String name;

    @Field(type = FieldType.Keyword, index = false)
    private String image;

    @Field(type = FieldType.Integer)
    private Integer price;
}
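Mapping annotations
Spring Data declares field mapping properties through annotations. The three annotations used here are:
@Document: applied to the class; marks the entity class as a document object. indexName: the name of the corresponding index
@Id: applied to a member variable; marks the field used as the document id
@Field: applied to a member variable; marks it as a document field and specifies its mapping properties:
type: the field type, a value of the FieldType enum
index: whether the field is indexed; boolean, default true
store: whether the field is stored; boolean, default false
analyzer: the analyzer name, e.g. ik_max_word
package com.atguigu.springdata.repository;

import com.atguigu.springdata.model.Product;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

/**
 * Repository interface provided through Spring Data Elasticsearch,
 * used to operate on the documents of the index.
 */
public interface ProductRepository extends ElasticsearchRepository<Product, Long> {
    // CRUD operations on the index documents are inherited
}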
When the project starts, the index is created automatically.
Testing a query
Once spring-boot-starter-data-elasticsearch is on the classpath and the configuration file has been added, Spring Boot configures the ES connection automatically.
package com.atguigu.elasticsearch_demo_es;

import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.elasticsearch.core.SearchResponse;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

import java.io.IOException;

@SpringBootTest
class ElasticsearchDemoEsApplicationTests {

    @Autowired
    private ElasticsearchClient elasticsearchClient;

    @Test
    void contextLoads() throws IOException {
        // match query on the title field of my_index
        SearchResponse<Object> search = elasticsearchClient.search(
                s -> s.index("my_index").query(q -> q.match(m -> m.field("title").query("华为"))),
                Object.class);
        System.out.println(search);
    }
}
Testing the repository interface
package com.atguigu.springdata;

import com.atguigu.springdata.model.Product;
import com.atguigu.springdata.repository.ProductRepository;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

import java.util.Optional;

@SpringBootTest
class SpringDataESDemoTestByRepository {

    @Autowired
    private ProductRepository productRepository;

    @Test
    public void test() {
        // Save a document:
        //Product product = new Product();
        //product.setId(2L);
        //product.setName("小米2");
        //productRepository.save(product);

        // Delete a document:
        //productRepository.deleteById(2L);

        // Look up a document by id; print it only if it exists
        Optional<Product> optional = productRepository.findById(3L);
        optional.ifPresent(System.out::println);
    }
}
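Summary: for simple queries and for creating indexes and mappings, the Spring Data Elasticsearch API is recommended; for complex queries, use the Elasticsearch Java API Client. Combining the two makes development more convenient.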