Reading notes: Interactive Path Reasoning on Graph for Conversational Recommendation


1. Understanding task-oriented conversational recommendation

1.1 Process

A notable advantage of this type of conversational recommendation over traditional methods is that it can "directly ask users about their preferred attributes on items" (traditional methods "suffer from the intrinsic limitation of passively acquiring user feedback in the process of making recommendations").

1.2 Three basic problems to solve:

  1. what questions to ask regarding item attributes,
  2. when to recommend items,
  3. how to adapt to the users' online feedback.

The paper notes that, to the best of the authors' knowledge, no unified framework addresses all three problems.

1.3 Two underlying assumptions

(1) It assumes that the user clearly expresses his preferences by specifying attributes without any reservations, and that the dataset contains enough items with the preferred attributes.

(2) It assumes that the CRS does not handle strong negative feedback. (This means that if a user rejects the asked attribute, the CRS does not distinguish whether the user does not care about it or hates it. Such negative feedback is hard to obtain in current data, making it difficult to simulate in experimental settings. Therefore, the CRS treats all rejected attributes as "does not care" and only removes them from the candidate set, without further actions such as removing all items that contain the rejected attributes.)

2. Reviewing the paper's baseline: EAR

The algorithm has three stages:

2.1 Estimation
In this stage, the recommender component (RC) ranks candidate items and item attributes for the user, so as to support the action decision of the conversational component (CC).

2.2 Action
The CC decides whether to choose an attribute to ask or to make a recommendation, according to the ranked candidates and attributes and the dialogue history.
If the user likes the attribute asked by the RC, the CC feeds this attribute back to the RC to make a new estimation again; otherwise, the system stays at the action stage: updates the dialogue history and chooses another action. Once a recommendation is rejected by a user, the CC sends the rejected items back to RC, triggering the reflection stage where the RC adjusts its estimations. After that, the system enters the estimation stage again.

2.3 Reflection
Reflection is triggered during the action stage.
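The stage switching described in 2.1 to 2.3 can be read as a small state machine. The following is only a sketch of that reading: the event names (`ranked`, `attribute_accepted`, and so on) are hypothetical labels introduced here, not identifiers from the EAR paper.

```python
# Sketch of the EAR stage switching as described in these notes.
# Stage names follow the notes; event names are hypothetical labels.
TRANSITIONS = {
    ("estimation", "ranked"): "action",                   # RC finished ranking
    ("action", "attribute_accepted"): "estimation",       # user likes the asked attribute
    ("action", "attribute_rejected"): "action",           # stay and pick another action
    ("action", "recommendation_rejected"): "reflection",  # rejected items go back to the RC
    ("reflection", "adjusted"): "estimation",             # RC adjusted its estimations
}


def next_stage(stage, event):
    """Return the stage that follows `stage` when `event` happens."""
    return TRANSITIONS[(stage, event)]


def trace(events, stage="estimation"):
    """Replay a list of events and return every stage visited."""
    visited = [stage]
    for event in events:
        stage = next_stage(stage, event)
        visited.append(stage)
    return visited


stages = trace(["ranked", "attribute_rejected", "attribute_accepted",
                "ranked", "recommendation_rejected", "adjusted"])
print(stages)
```

Reflection is only reachable from the action stage, which matches the remark in 2.3.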

3. Abstract

One advantage of a CRS is that it can "directly ask users about their preferred attributes on items." However, existing CRS methods do not make full use of this advantage: "they only use the attribute feedback in rather implicit ways such as updating the latent user representation".

This paper utilizes the user-preferred attributes in an explicit way: the authors propose Conversational Path Reasoning (CPR), a generic framework that models conversational recommendation as an interactive path reasoning problem on a graph.
By leveraging on the graph structure, CPR is able to prune off many irrelevant candidate attributes, leading to better chance of hitting user preferred attributes.

Summary:
The key hypothesis of this work is that, a more explicit way of utilizing the attribute preference can better carry forward the advantages of CRS — being more accurate and explainable

4. Method overview (how do the authors use the graph structure?)

A conversation session in our CPR is expressed as a walk in the graph. It starts from the user vertex and travels in the graph with the goal of reaching one or more item vertices the user likes as the destination. Note that the walk is navigated by the user through conversation. This means that, at each step, the system needs to interact with the user to find out which vertex to go to, and takes actions according to the user's response.
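To make the graph concrete, here is a minimal sketch of a heterogeneous graph with user, item and attribute vertices. The class and function names are hypothetical, and the definition of "adjacent attributes" used here (two attributes that share at least one item) is an assumption made for illustration.

```python
from collections import defaultdict


class HeteroGraph:
    """Undirected graph whose vertices are (kind, name) pairs."""

    def __init__(self):
        self.neighbors = defaultdict(set)

    def add_edge(self, a, b):
        self.neighbors[a].add(b)
        self.neighbors[b].add(a)

    def neighbors_of_kind(self, vertex, kind):
        """Neighbors of `vertex` whose kind tag equals `kind`."""
        return {n for n in self.neighbors[vertex] if n[0] == kind}


def adjacent_attributes(graph, attribute):
    """Attributes sharing at least one item with `attribute` (assumed definition)."""
    result = set()
    for item in graph.neighbors_of_kind(attribute, "item"):
        result |= graph.neighbors_of_kind(item, "attr")
    result.discard(attribute)
    return result


# Hypothetical toy data: one user, three songs, four attributes.
g = HeteroGraph()
g.add_edge(("user", "tom"), ("item", "song_a"))
g.add_edge(("item", "song_a"), ("attr", "rock"))
g.add_edge(("item", "song_a"), ("attr", "90s"))
g.add_edge(("item", "song_b"), ("attr", "rock"))
g.add_edge(("item", "song_b"), ("attr", "live"))
g.add_edge(("item", "song_c"), ("attr", "jazz"))

print(sorted(adjacent_attributes(g, ("attr", "rock"))))
```

Standing at `rock`, only `90s` and `live` are reachable as the next attribute to ask; `jazz` is never considered because no item links it to `rock`.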

5. Contributions

(1) We propose the CPR framework to model conversational recommendation as a path reasoning problem on a heterogeneous graph, which provides a new angle on building CRS. To the best of our knowledge, it is the first time graph-based reasoning has been introduced to multi-round conversational recommendation.
(2) To demonstrate the effectiveness of CPR, we provide a simple instantiation, SCPR, which outperforms existing methods in various settings. We find that the larger the attribute space is, the more improvement our model can achieve.

6. The algorithm in detail

The system treats attributes as the preference feedback. To explicitly utilize this feedback, CPR performs the walk (i.e., reasoning) over the attribute vertices. Specifically, CPR maintains an active path P, comprising the attributes confirmed by the user (i.e., all attributes in P_u) in chronological order, and explores the graph for the next attribute vertex to walk to.

Now we move to the detailed walking process in CPR. Assume the current active path is P = p_0, p_1, p_2, …, p_t. The system stays at p_t and is going to find the next attribute vertex to walk to. This process can be decomposed into three steps: reasoning, consultation and transition.

6.1 Reasoning

As in the EAR model, this step scores items and attributes.
The item scores are also computed as in the EAR model
(but graph information is used here: the paper takes the items directly connected to the path as the candidate items).
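The scoring formula itself is not preserved in these notes, so the following is only a sketch under assumptions: candidate items are taken to be the items linked to every confirmed attribute, and the score is assumed to be a factorization-machine-style sum of a user-item inner product and item-attribute inner products. All names and toy embeddings are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def candidate_items(item_attributes, confirmed):
    """Items linked to every confirmed attribute (assumed reading of 'connected to the path')."""
    return {item for item, attrs in item_attributes.items() if set(confirmed) <= attrs}


def score_item(user_vec, item_vec, confirmed_attr_vecs):
    """Assumed FM-style score: user-item affinity plus item-attribute affinities."""
    return dot(user_vec, item_vec) + sum(dot(item_vec, p) for p in confirmed_attr_vecs)


# Hypothetical toy embeddings and item-attribute links.
item_attributes = {"song_a": {"rock", "90s"}, "song_b": {"rock"}, "song_c": {"jazz"}}
item_vecs = {"song_a": [0.5, 0.5], "song_b": [1.0, -1.0], "song_c": [0.0, 1.0]}
attr_vecs = {"rock": [0.0, 2.0]}
user_vec = [1.0, 0.0]

confirmed = ["rock"]
cands = candidate_items(item_attributes, confirmed)
scores = {v: score_item(user_vec, item_vecs[v], [attr_vecs[p] for p in confirmed])
          for v in cands}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Here `song_c` is never scored because it is not linked to the confirmed attribute, which is the pruning effect the graph provides.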
The improvement over the EAR model lies in the attribute scores:
(1) Using the adjacent vertices in the graph shrinks the space of candidate attributes.
(2) The candidate items are also used during scoring.
(The idea is that, with updated scores (i.e., s_v) calculated in the first step, the items provide additional information to find proper attributes to consult the user. An expected strategy is to find the one that can better eliminate the uncertainty of items.)
(Information entropy is used here because we want the chosen p to reduce the uncertainty of the later item selection; "information entropy has proven to be an effective method of uncertainty estimation [27]".)
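The entropy-based attribute score can be sketched as follows. The exact formula is not preserved in these notes; this sketch assumes that each candidate item is weighted by the sigmoid of its score s_v, that prob(p) is the weighted share of candidate items carrying attribute p, and that the attribute score is -prob(p) * log2(prob(p)). Function names and toy data are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attribute_scores(item_scores, item_attributes, candidate_attrs):
    """Weighted-entropy score per attribute (assumed form, non-empty item_scores)."""
    total = sum(sigmoid(s) for s in item_scores.values())
    scores = {}
    for p in candidate_attrs:
        # Weighted mass of candidate items that carry attribute p.
        mass = sum(sigmoid(s) for v, s in item_scores.items() if p in item_attributes[v])
        prob = mass / total
        scores[p] = -prob * math.log2(prob) if prob > 0.0 else 0.0
    return scores


# Hypothetical toy data: three candidate items with equal scores.
item_attributes = {
    "song_a": {"rock", "live", "90s"},
    "song_b": {"rock", "live"},
    "song_c": {"live"},
}
item_scores = {"song_a": 0.0, "song_b": 0.0, "song_c": 0.0}

attr_scores = attribute_scores(item_scores, item_attributes, ["rock", "live", "90s"])
best = max(attr_scores, key=attr_scores.get)
print(best)
```

An attribute shared by every candidate item (`live` here) gets a score of zero, since asking about it cannot narrow down the items.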

6.2 Consultation

This step also improves on the EAR model: in EAR the search space contains all attributes (because the policy must learn, through reinforcement learning, which attribute to ask), whereas here the search space has size 2.

The policy only has to decide whether to ask an attribute or to recommend items.

It is trained with standard deep Q-learning, using a two-layer feed-forward neural network. The policy network takes the state vector s as input and outputs the values Q(s, a) for the two actions, indicating the estimated reward for a_ask or a_rec.
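A two-action Q-network of this shape can be sketched in plain Python. The state size, hidden width, ReLU activation, weight initialisation and epsilon-greedy selection below are all assumptions for illustration; the notes only state that the network has two layers and outputs Q(s, a) for a_ask and a_rec. Only the forward pass and the standard Q-learning target are shown, not the gradient update.

```python
import random

ACTIONS = ("ask", "rec")


def make_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(layer, x):
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def q_values(params, state):
    """Forward pass: hidden layer with ReLU, then one output per action."""
    hidden = [max(0.0, h) for h in linear(params["hidden"], state)]
    return linear(params["output"], hidden)


def choose_action(params, state, epsilon, rng):
    """Epsilon-greedy choice between asking and recommending."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    q = q_values(params, state)
    return ACTIONS[q.index(max(q))]


def td_target(params, reward, next_state, gamma, done):
    """Standard Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(q_values(params, next_state))


rng = random.Random(0)
params = {"hidden": make_layer(4, 8, rng), "output": make_layer(8, 2, rng)}
state = [0.2, 0.0, 1.0, 0.5]  # hypothetical 4-dimensional state vector
q = q_values(params, state)
action = choose_action(params, state, epsilon=0.0, rng=rng)
print(q, action)
```

Because the network outputs only two values, its size does not grow with the number of attributes, which is the point made about the reduced search space.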

6.3 Transition

(1) If the user confirms an asked attribute p_t, [the update rule was given as an image that is not preserved here]
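Since the update rule is missing from these notes, the sketch below is a guess at the bookkeeping, pieced together from statements elsewhere in the notes: a confirmed attribute extends the path and narrows the candidates, while a rejected attribute is only removed from the attribute candidates and leaves the items untouched (assumption (2) in section 1.3). The state layout and the function name are hypothetical.

```python
def transition(state, attribute, confirmed, item_attributes):
    """Return a new state after the user answers a question about `attribute`."""
    path = list(state["path"])
    rejected = set(state["rejected"])
    cand_items = set(state["cand_items"])
    cand_attrs = set(state["cand_attrs"])
    if confirmed:
        # Walk to the confirmed attribute vertex.
        path.append(attribute)
        # Keep only items that carry the newly confirmed attribute.
        cand_items = {v for v in cand_items if attribute in item_attributes[v]}
        # Assumed rule: next candidates are attributes sharing an item with the new vertex.
        cand_attrs = set()
        for attrs in item_attributes.values():
            if attribute in attrs:
                cand_attrs |= attrs
        cand_attrs -= set(path) | rejected
    else:
        # Rejection is read as "does not care": items are left untouched.
        rejected.add(attribute)
        cand_attrs.discard(attribute)
    return {"path": path, "rejected": rejected,
            "cand_items": cand_items, "cand_attrs": cand_attrs}


# Hypothetical toy data.
item_attributes = {
    "song_a": {"rock", "90s"},
    "song_b": {"rock", "live"},
    "song_c": {"jazz", "live"},
}
state0 = {"path": [], "rejected": set(),
          "cand_items": set(item_attributes),
          "cand_attrs": {"rock", "90s", "live", "jazz"}}

state1 = transition(state0, "jazz", False, item_attributes)
state2 = transition(state1, "rock", True, item_attributes)
print(state2)
```

After rejecting `jazz` and confirming `rock`, the path is `["rock"]`, the candidate items are `song_a` and `song_b`, and the attributes left to ask are `90s` and `live`.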

7. Advantages of the model

(1) It is crystally explainable. It models conversational recommendation as an interactive path reasoning problem on the graph, with each step confirmed by the user. Thus, the resultant path is the correct reason for the recommendation. This makes better use of the fine-grained attribute preference than existing methods that only model attribute preference in latent space.

Explanation of the last sentence: the EAR model "feed[s] the preferred attribute into a variant of factorization machine [20] to score items in the latent space." This uses the attribute feedback from the user only implicitly.

(2) It facilitates the exploitation of the abundant information by introducing the graph structure. By limiting the candidate attributes to ask to the adjacent attributes of the current vertex, the candidate space is largely reduced, leading to a significant advantage compared with existing CRS methods like [13, 24] that treat almost all attributes as the candidates.

(3) It is an aesthetically appealing framework which demonstrates the natural combination and mutual promotion of conversation system and recommendation system. On one hand, the path walking over the graph provides a natural dialogue state tracking for the conversation system, and it is believed to be efficient in making the conversation more logically coherent [12, 14]; on the other hand, being able to directly solicit attribute feedback from the user, the conversation provides a shortcut to prune off searching branches in the graph.

8. Model training

Note that the dataset contains only static interaction data, not conversations, so it is worth understanding how training works here.
(1) Offline training
An offline training for the scoring function of items in the reasoning step. We use the historical clicking record in the training set to optimize our factorization machine offline (Eq. (3)). The goal is to assign a higher score to the clicked item for each user.
Note: the authors adopt the EAR model's training procedure here.
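One common way to "assign a higher score to the clicked item" is a pairwise ranking loss. The sketch below uses a BPR-style loss as an assumption; the notes do not preserve the actual objective, and the function name is hypothetical.

```python
import math


def bpr_loss(pos_score, neg_score):
    """-log(sigmoid(pos - neg)), computed in a numerically stable form."""
    diff = pos_score - neg_score
    # log(1 + exp(-diff)) rewritten so that exp never overflows.
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# The loss shrinks as the clicked item is scored further above the other one.
well_ranked = bpr_loss(2.0, 0.0)
badly_ranked = bpr_loss(0.0, 2.0)
print(well_ranked < badly_ranked)
```

Minimising this loss over (clicked, non-clicked) item pairs pushes the clicked item's score above the other's, which is the stated goal of the offline stage.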
(2) Online training
An online training for the reinforcement learning used in the consultation step. We use a user simulator (cf. Sec 5.2.2) in place of a real user to train the policy network on the validation set.
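Because the dataset holds only static interactions, a simulator has to turn each observed (user, item) pair into a conversation. A minimal sketch, assuming the simulated user says yes to exactly the attributes of the target item and accepts a recommendation list only if it contains that item; the class and method names are hypothetical.

```python
class SimulatedUser:
    """Answers questions as if the target item were what the user wants."""

    def __init__(self, target_item, target_attributes):
        self.target_item = target_item
        self.target_attributes = set(target_attributes)

    def answer_attribute(self, attribute):
        """Confirm an attribute only if the target item has it."""
        return attribute in self.target_attributes

    def answer_recommendation(self, recommended_items):
        """Accept only if the target item is in the recommended list."""
        return self.target_item in recommended_items


# One hypothetical interaction record from the validation set.
user = SimulatedUser("song_b", {"rock", "live"})
answers = [user.answer_attribute(a) for a in ["rock", "jazz", "live"]]
accepted = user.answer_recommendation(["song_a", "song_b"])
print(answers, accepted)
```

Each simulated session produces the yes/no feedback that the policy network needs, without any recorded dialogue.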

9. Conclusion and future work

We are the first to introduce graph to address the multi-round conversational recommendation problem, and propose the Conversational Path Reasoning (CPR) framework. CPR synchronizes conversation with the graph-based path reasoning, making the utilization of attributes more explicit and hence greatly improving explainability for conversational recommendation. Specifically, it tackles the problems of what item to recommend and what attribute to ask through message propagation on the graph, leveraging the complex interaction between attributes and items in the graph to better rank items and attributes. Using the graph structure, a CRS only transits to an adjacent attribute, reducing the attribute candidate space and also improving the coherence of the conversation. Also, since the items and attributes have been ranked during the message propagation, the policy network only needs to decide when to ask and when to recommend, reducing the action space to 2. This relieves the modeling load of the policy network, enabling it to be more robust, especially when the candidate space is large.
There are many interesting problems to be explored for CPR. First, the CPR framework itself can be further improved. For example, CPR does not consider how to adapt the model when the user rejects a recommended item. How to effectively consider such rejected items would be an interesting and challenging task. Second, more sophisticated implementations can be considered. For example, it is possible to build more expressive models for attribute scoring other than the weighted max-entropy adopted in this paper. Currently, the embeddings of items and attributes do not get updated during the interactive training. It would be better to build a more holistic model that incorporates the user feedback to update all parameters in the model, inclusive of user, item and attribute embeddings.
