Boosting Scores in Solr with edismax and Printing the Score Explanation

2024-02-16 01:48


First, the definition of the DisMax query parser:

The DisMax query parser is designed to process simple phrases (without complex syntax) entered by users and to search for individual terms across several fields using different weighting (boosts) based on the significance of each field. Additional options enable users to influence the score based on rules specific to each use case (independent of user input).

Next, the definition of eDisMax (the Extended DisMax query parser):

The Extended DisMax (eDisMax) query parser is an improved version of the DisMax query parser. It includes an improved boost function: in Extended DisMax, the boost function is a multiplier rather than an addend, improving your boost results; the additive boost functions of DisMax (bf and bq) are also supported.

In addition to all the DisMax parameters, Extended DisMax includes these query parameters:
[The boost Parameter]

A multivalued list of strings parsed as queries with scores multiplied by the score from the main query for all matching documents. This parameter is shorthand for wrapping the query produced by eDisMax using the BoostQParserPlugin.

In other words, the boost parameter multiplies the original score by its value, and that value can be the name of a field.


For example, suppose the following data is imported from MySQL into Solr:

+----+-------------------------+--------+
| id | keyword                 | weight |
+----+-------------------------+--------+
|  3 | 中国                    |    1.0 |
|  4 | 美国人民                |    1.0 |
|  5 | 人民群众                |    1.0 |
|  6 | 美国人民                |    1.0 |
|  7 | 中国人民解放军          |    2.0 |
|  8 | 中国很好,美国也不错     |   10.0 |
|  9 | chinese people          |    1.0 |
| 10 | my god, you are chinese |    1.0 |
| 11 | you are chinese people  |    1.0 |
| 12 | 中国中国                |    1.0 |
+----+-------------------------+--------+

When running a query, you can print the score explanation by enabling debugQuery: either set debugQuery=true under Raw Query Parameters, or tick the debugQuery checkbox on the query form.

For example, query the keyword directly, without any boost:
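As a rough sketch, the same unboosted request can be built outside the Admin UI. The host, port, and core name (`mycore`) below are assumptions for illustration and do not come from the original setup; only `q=keyword:中国` and `debugQuery=true` are taken from the article.

```python
from urllib.parse import urlencode

# Hypothetical Solr location and core name; adjust to your own deployment.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_debug_query_url(query: str) -> str:
    """Return a select URL that asks Solr to include the score explanation."""
    params = {
        "q": query,            # the plain query from the article
        "wt": "json",          # JSON response, matching the output shown
        "debugQuery": "true",  # adds the "debug" section with "explain"
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


url = build_debug_query_url("keyword:中国")
print(url)
```

Fetching this URL with any HTTP client should return a response whose `debug` section resembles the one in this article, assuming the core holds the same data.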



Score details in the response:

"debug": {
    "rawquerystring": "keyword:中国",
    "querystring": "keyword:中国",
    "parsedquery": "(keyword:中国 keyword:china)/no_coord",
    "parsedquery_toString": "keyword:中国 keyword:china",
    "explain": {
      "3": "\n0.7724356 = sum of:\n  0.7724356 = weight(keyword:中国 in 0) [ClassicSimilarity], result of:\n    0.7724356 = score(doc=0,freq=1.0), product of:\n      0.4562129 = queryWeight, product of:\n        1.6931472 = idf(docFreq=4, maxDocs=10)\n        0.2694467 = queryNorm\n      1.6931472 = fieldWeight in 0, product of:\n        1.0 = tf(freq=1.0), with freq of:\n          1.0 = termFreq=1.0\n        1.6931472 = idf(docFreq=4, maxDocs=10)\n        1.0 = fieldNorm(doc=0)\n",
      "7": "\n0.24138615 = sum of:\n  0.24138615 = weight(keyword:中国 in 4) [ClassicSimilarity], result of:\n    0.24138615 = score(doc=4,freq=1.0), product of:\n      0.4562129 = queryWeight, product of:\n        1.6931472 = idf(docFreq=4, maxDocs=10)\n        0.2694467 = queryNorm\n      0.5291085 = fieldWeight in 4, product of:\n        1.0 = tf(freq=1.0), with freq of:\n          1.0 = termFreq=1.0\n        1.6931472 = idf(docFreq=4, maxDocs=10)\n        0.3125 = fieldNorm(doc=4)\n",
      "8": "\n0.3862178 = sum of:\n  0.3862178 = weight(keyword:中国 in 5) [ClassicSimilarity], result of:\n    0.3862178 = score(doc=5,freq=1.0), product of:\n      0.4562129 = queryWeight, product of:\n        1.6931472 = idf(docFreq=4, maxDocs=10)\n        0.2694467 = queryNorm\n      0.8465736 = fieldWeight in 5, product of:\n        1.0 = tf(freq=1.0), with freq of:\n          1.0 = termFreq=1.0\n        1.6931472 = idf(docFreq=4, maxDocs=10)\n        0.5 = fieldNorm(doc=5)\n",
      "12": "\n0.54619443 = sum of:\n  0.54619443 = weight(keyword:中国 in 9) [ClassicSimilarity], result of:\n    0.54619443 = score(doc=9,freq=2.0), product of:\n      0.4562129 = queryWeight, product of:\n        1.6931472 = idf(docFreq=4, maxDocs=10)\n        0.2694467 = queryNorm\n      1.1972358 = fieldWeight in 9, product of:\n        1.4142135 = tf(freq=2.0), with freq of:\n          2.0 = termFreq=2.0\n        1.6931472 = idf(docFreq=4, maxDocs=10)\n        0.5 = fieldNorm(doc=9)\n"
    },
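The numbers in the explain output can be reproduced by hand. The sketch below assumes the standard ClassicSimilarity formulas (idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq)); the queryNorm and fieldNorm values are copied from the output above rather than derived, and the function names are illustrative only.

```python
import math

# queryNorm as printed in the explain output above (not derived here).
QUERY_NORM = 0.2694467


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor as used by ClassicSimilarity."""
    return math.sqrt(freq)


def classic_score(freq: float, doc_freq: int, max_docs: int,
                  field_norm: float, query_norm: float = QUERY_NORM) -> float:
    """score = queryWeight * fieldWeight, following the explain tree."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


# id 3: freq=1, fieldNorm=1.0; id 12: freq=2, fieldNorm=0.5
score_doc_3 = classic_score(freq=1.0, doc_freq=4, max_docs=10, field_norm=1.0)
score_doc_12 = classic_score(freq=2.0, doc_freq=4, max_docs=10, field_norm=0.5)
print(round(score_doc_3, 6), round(score_doc_12, 6))
```

Up to floating-point rounding, the two values agree with the 0.7724356 and 0.54619443 reported for ids 3 and 12.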


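Now switch the query parser to edismax and set the boost parameter. In this example the weight column is used as the boost, so the final score is the original Lucene score multiplied by weight. For example: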
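A minimal sketch of the request parameters for this boosted query follows. `defType=edismax` and `boost=weight` correspond to the settings described above; the helper name and the idea of passing the parameters as a dictionary are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_edismax_boost_params(query: str, boost_field: str) -> dict:
    """Parameters for an eDisMax query whose score is multiplied by a field."""
    return {
        "q": query,
        "defType": "edismax",   # select the Extended DisMax query parser
        "boost": boost_field,   # multiplicative boost taken from this field
        "wt": "json",
        "debugQuery": "true",   # keep the score explanation in the response
    }


params = build_edismax_boost_params("keyword:中国", "weight")
query_string = urlencode(params)
print(query_string)
```

Appending this query string to the select handler of your core should produce a parsed query of the `boost(..., float(weight))` form shown in the debug output.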



Score details:

"debug": {
    "rawquerystring": "keyword:中国",
    "querystring": "keyword:中国",
    "parsedquery": "BoostedQuery(boost(+(keyword:中国 keyword:china),float(weight)))",
    "parsedquery_toString": "boost(+(keyword:中国 keyword:china),float(weight))",
    "explain": {
      "3": "\n0.7724356 = boost(keyword:中国 keyword:china,float(weight)), product of:\n  0.7724356 = sum of:\n    0.7724356 = weight(keyword:中国 in 0) [ClassicSimilarity], result of:\n      0.7724356 = score(doc=0,freq=1.0), product of:\n        0.4562129 = queryWeight, product of:\n          1.6931472 = idf(docFreq=4, maxDocs=10)\n          0.2694467 = queryNorm\n        1.6931472 = fieldWeight in 0, product of:\n          1.0 = tf(freq=1.0), with freq of:\n            1.0 = termFreq=1.0\n          1.6931472 = idf(docFreq=4, maxDocs=10)\n          1.0 = fieldNorm(doc=0)\n  1.0 = float(weight)=1.0\n",
      "7": "\n0.4827723 = boost(keyword:中国 keyword:china,float(weight)), product of:\n  0.24138615 = sum of:\n    0.24138615 = weight(keyword:中国 in 4) [ClassicSimilarity], result of:\n      0.24138615 = score(doc=4,freq=1.0), product of:\n        0.4562129 = queryWeight, product of:\n          1.6931472 = idf(docFreq=4, maxDocs=10)\n          0.2694467 = queryNorm\n        0.5291085 = fieldWeight in 4, product of:\n          1.0 = tf(freq=1.0), with freq of:\n            1.0 = termFreq=1.0\n          1.6931472 = idf(docFreq=4, maxDocs=10)\n          0.3125 = fieldNorm(doc=4)\n  2.0 = float(weight)=2.0\n",
      "8": "\n3.862178 = boost(keyword:中国 keyword:china,float(weight)), product of:\n  0.3862178 = sum of:\n    0.3862178 = weight(keyword:中国 in 5) [ClassicSimilarity], result of:\n      0.3862178 = score(doc=5,freq=1.0), product of:\n        0.4562129 = queryWeight, product of:\n          1.6931472 = idf(docFreq=4, maxDocs=10)\n          0.2694467 = queryNorm\n        0.8465736 = fieldWeight in 5, product of:\n          1.0 = tf(freq=1.0), with freq of:\n            1.0 = termFreq=1.0\n          1.6931472 = idf(docFreq=4, maxDocs=10)\n          0.5 = fieldNorm(doc=5)\n  10.0 = float(weight)=10.0\n",
      "12": "\n0.54619443 = boost(keyword:中国 keyword:china,float(weight)), product of:\n  0.54619443 = sum of:\n    0.54619443 = weight(keyword:中国 in 9) [ClassicSimilarity], result of:\n      0.54619443 = score(doc=9,freq=2.0), product of:\n        0.4562129 = queryWeight, product of:\n          1.6931472 = idf(docFreq=4, maxDocs=10)\n          0.2694467 = queryNorm\n        1.1972358 = fieldWeight in 9, product of:\n          1.4142135 = tf(freq=2.0), with freq of:\n            2.0 = termFreq=2.0\n          1.6931472 = idf(docFreq=4, maxDocs=10)\n          0.5 = fieldNorm(doc=9)\n  1.0 = float(weight)=1.0\n"
    },


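As the output shows, the record with id 7 has a weight of 2.0, so its score doubles (from 0.24138615 to 0.4827723), and the record with id 8 has a weight of 10.0, so its score becomes ten times larger (from 0.3862178 to 3.862178). Records with a weight of 1.0 keep their original scores.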
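The effect on ranking can be checked with a few lines of arithmetic. The sketch below uses only the base scores and weights printed in the two debug outputs; the variable names are illustrative.

```python
# Base scores (without boost) and weights, copied from the debug output.
base_scores = {"3": 0.7724356, "7": 0.24138615, "8": 0.3862178, "12": 0.54619443}
weights = {"3": 1.0, "7": 2.0, "8": 10.0, "12": 1.0}

# The eDisMax boost parameter multiplies the base score by the field value.
boosted_scores = {doc_id: base_scores[doc_id] * weights[doc_id]
                  for doc_id in base_scores}


def rank(scores: dict) -> list:
    """Document ids ordered by descending score."""
    return sorted(scores, key=scores.get, reverse=True)


ranking_before = rank(base_scores)
ranking_after = rank(boosted_scores)
print(ranking_before)
print(ranking_after)
```

With these numbers, id 8 moves from third place to first, while id 7 stays last even after its score doubles, because 0.4827723 is still below the unboosted score of id 12.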




http://www.chinasem.cn/article/713154
