Deep Matching in Search and Recommendation: 2.5 Further Reading

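Query reformulation is another way to address query-document mismatch in search: the query is transformed into another query that can be matched more effectively. Query transformation includes correcting spelling errors in the query. For example, 【1】 proposed a source channel model, and 【2】 proposed a discriminative method for this task. Query transformation also includes query segmentation 【3】【4】【5】. Inspired by statistical machine translation (SMT), researchers have also applied translation techniques to the query-document mismatch problem, treating the query as written in one language and the document as written in another. 【6】 used a word-based translation model for this task. 【7】 proposed a phrase-based translation model to capture dependencies between words in the query and words in the document title. Topic models can also be used to address the mismatch problem. A simple and effective approach is a linear combination of the term matching score and the topic matching score 【8】. Probabilistic topic models are also used to smooth document language models (or query language models) 【9】【10】. 【11】 provides a comprehensive survey of traditional machine learning approaches to semantic matching in search.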
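As a rough illustration of the linear-combination idea mentioned above, the sketch below mixes a term matching score with a topic matching score. Everything in it is an assumption made for illustration: the function names, the simple overlap-based term score (a stand-in for a real term matching function such as BM25), the cosine similarity over topic vectors, and the weight alpha. It is not claimed to be the exact formulation used in 【8】.

```python
import math
from typing import Sequence


def term_overlap_score(query_terms: Sequence[str], doc_terms: Sequence[str]) -> float:
    """Fraction of query terms found in the document (hypothetical stand-in for BM25 etc.)."""
    if not query_terms:
        return 0.0
    doc_vocab = set(doc_terms)
    matched = sum(1 for term in query_terms if term in doc_vocab)
    return matched / len(query_terms)


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two topic vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_match_score(
    query_terms: Sequence[str],
    doc_terms: Sequence[str],
    query_topics: Sequence[float],
    doc_topics: Sequence[float],
    alpha: float = 0.5,
) -> float:
    """Linear combination: alpha * term score + (1 - alpha) * topic score."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    term_score = term_overlap_score(query_terms, doc_terms)
    topic_score = cosine_similarity(query_topics, doc_topics)
    return alpha * term_score + (1.0 - alpha) * topic_score


# Toy example: only one of two query terms matches literally,
# but the (made-up) topic vectors of query and document agree.
score = combined_match_score(
    ["cheap", "flights"],
    ["low", "cost", "flights"],
    query_topics=[1.0, 0.0],
    doc_topics=[1.0, 0.0],
    alpha=0.5,
)
print(score)  # 0.5 * 0.5 + 0.5 * 1.0 = 0.75
```

The topic score lifts a document that shares few literal terms with the query, which is the mismatch case this section is concerned with. In practice the topic vectors would come from a trained topic model, and alpha would be tuned on held-out data.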

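For recommendation, other types of methods have been developed in addition to the classical latent factor models introduced earlier. For example, matching can be performed in the original interaction space with predefined heuristics, as in item-based collaborative filtering (CF) 【12】 and unified user-based and item-based CF 【13】. User-item interactions can be organized as a bipartite graph, on which random walks are performed to estimate the relevance between any two nodes (a user and an item, two users, or two items) 【14】【15】. The generative process of user-item interactions can also be modeled with probabilistic graphical models 【16】. To incorporate side information such as user profiles and contexts, tensor factorization 【17】 and collective matrix factorization 【18】 have been used in addition to the FM model introduced earlier. We refer readers to two survey papers on traditional matching methods for recommendation 【19】【20】.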
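The bipartite-graph idea above can be sketched with a random walk with restart computed by power iteration. This is a minimal sketch under stated assumptions, not the algorithm of 【14】 or 【15】: the function name, the restart probability, the iteration count, and the toy edge list are all hypothetical, and user and item identifiers are assumed not to collide.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def random_walk_with_restart(
    edges: Iterable[Tuple[Hashable, Hashable]],
    source: Hashable,
    restart: float = 0.15,
    iterations: int = 100,
) -> Dict[Hashable, float]:
    """Estimate the relevance of every node to `source` on an undirected bipartite graph."""
    # Build an undirected adjacency list from (user, item) interaction pairs.
    neighbors = defaultdict(set)
    for user, item in edges:
        neighbors[user].add(item)
        neighbors[item].add(user)
    if source not in neighbors:
        raise KeyError(f"unknown source node: {source!r}")

    # Start with all probability mass on the source node.
    scores = {node: 0.0 for node in neighbors}
    scores[source] = 1.0

    for _ in range(iterations):
        updated = {node: 0.0 for node in neighbors}
        for node, mass in scores.items():
            # With probability (1 - restart), move to a uniformly chosen neighbor.
            share = (1.0 - restart) * mass / len(neighbors[node])
            for neighbor in neighbors[node]:
                updated[neighbor] += share
        # With probability `restart`, jump back to the source node.
        updated[source] += restart
        scores = updated
    return scores


# Toy interaction data (made up): u1 has never interacted with i3,
# but i3 is reachable through u2, who shares item i2 with u1.
interactions = [("u1", "i1"), ("u1", "i2"), ("u2", "i2"), ("u2", "i3"), ("u3", "i3")]
relevance = random_walk_with_restart(interactions, source="u1")
for node in ("i1", "i2", "i3"):
    print(node, round(relevance[node], 4))
```

Because the walk starts and restarts at u1, items that u1 has not touched still receive a nonzero score when they are reachable through shared users, which is how such methods estimate relevance between nodes without a direct interaction.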

References

【1】Brill, E. and R. C. Moore (2000). “An improved error model for noisy channel spelling correction”. In: Proceedings of the 38th Annual Meeting on Association for Computational Linguistics. ACL ’00. Hong Kong: Association for Computational Linguistics. 286–293.
【2】Wang, Z., G. Xu, H. Li, and M. Zhang (2011). “A fast and accurate method for approximate string search”. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies – Volume 1. HLT ’11. Portland, OR, USA: Association for Computational Linguistics. 52–61. url: http://dl.acm.org/citation.cfm?id=2002472.2002480.
【3】Bendersky, M., W. B. Croft, and D. A. Smith (2011). “Joint annotation of search queries”. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies – Volume 1. HLT ’11. Portland, OR, USA: Association for Computational Linguistics. 102–111. url: http://dl.acm.org/citation.cfm?id=2002472.2002486.
【4】Bergsma, S. and Q. I. Wang (2007). “Learning noun phrase query segmentation”. In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL). Prague, Czech Republic: Association for Computational Linguistics. 819–826. url: https://www.aclweb.org/anthology/D07-1086.
【5】Guo, J., G. Xu, H. Li, and X. Cheng (2008). “A unified and discriminative model for query refinement”. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR ’08. Singapore, Singapore: ACM. 379–386.
【6】Berger, A. and J. Lafferty (1999). “Information retrieval as statistical translation”. In: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR ’99. Berkeley, CA, USA: ACM. 222–229.
【7】Gao, J., J.-Y. Nie, G. Wu, and G. Cao (2004). “Dependence language model for information retrieval”. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR ’04. Sheffield, UK: ACM. 170–177.
【8】Hofmann, T. (1999). “Probabilistic latent semantic indexing”. In: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR ’99. Berkeley, CA, USA: ACM. 50–57.
【9】Wei, X. and W. B. Croft (2006). “LDA-based document models for ad-hoc retrieval”. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR ’06. Seattle, Washington, DC, USA: ACM. 178–185.
【10】Yi, X. and J. Allan (2009). “A comparative study of utilizing topic models for information retrieval”. In: Proceedings of the 31th European Conference on IR Research on Advances in Information Retrieval. ECIR ’09. Toulouse, France: Springer-Verlag. 29–41.
【11】Li, H. and J. Xu (2014). “Semantic matching in search”. Foundations and Trends in Information Retrieval. 7(5): 343–469.
【12】Sarwar, B., G. Karypis, J. Konstan, and J. Riedl (2001). “Item-based collaborative filtering recommendation algorithms”. In: Proceedings of the 10th International Conference on World Wide Web. WWW ’01. Hong Kong, Hong Kong: ACM. 285–295.
【13】Wang, J., A. P. de Vries, and M. J. T. Reinders (2006). “Unifying user-based and item-based collaborative filtering approaches by similarity fusion”. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR ’06. Seattle, Washington, DC, USA: ACM. 501–508.
【14】Eksombatchai, C., P. Jindal, J. Z. Liu, Y. Liu, R. Sharma, C. Sugnet, M. Ulrich, and J. Leskovec (2018). “Pixie: A system for recommending 3+ Billion items to 200+ Million users in real-time”. In: Proceedings of the 2018 World Wide Web Conference on World Wide Web, WWW 2018, Lyon, France. 1775–1784.
【15】He, X., M. Gao, M.-Y. Kan, and D. Wang (2017b). “BiRank: Towards ranking on bipartite graphs”. IEEE Transactions on Knowledge and Data Engineering. 29(1): 57–71.
【16】Salakhutdinov, R. and A. Mnih (2007). “Probabilistic matrix factorization”. In: Proceedings of the 20th International Conference on Neural Information Processing Systems. NIPS’07. Vancouver, British Columbia, Canada: Curran Associates Inc. 1257–1264. url: http://dl.acm.org/citation.cfm?id=2981562.2981720.
【17】Karatzoglou, A., X. Amatriain, L. Baltrunas, and N. Oliver (2010). “Multiverse recommendation: N-dimensional tensor factorization for context-aware collaborative filtering”. In: Proceedings of the Fourth ACM Conference on Recommender Systems. RecSys ’10. Barcelona, Spain: ACM. 79–86.
【18】He, X., M.-Y. Kan, P. Xie, and X. Chen (2014). “Comment-based multi-view clustering of web 2.0 items”. In: Proceedings of the 23rd International Conference on World Wide Web. WWW ’14. Seoul, Korea: ACM. 771–782.
【19】Adomavicius, G. and A. Tuzhilin (2005). “Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions”. IEEE Transactions on Knowledge and Data Engineering. 17(6): 734–749.
【20】Shi, Y., M. Larson, and A. Hanjalic (2014). “Collaborative filtering beyond the user-item matrix: A survey of the state of the art and future challenges”. ACM Computing Surveys. 47(1): 3:1–3:45.