[27] F. Zhou, T. Wang, T. Zhong, and G. Trajcevski, “Identifying user geolocation with hierarchical graph neural networks and explainable fusion,” Inf. Fusion, vol. 81, pp. 1–13, 2022. (Using hierarchical graph neural networks and explainable
Paper: Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models
Code: https://github.com/Ucas-HaoranWei/Vary
Source: MEGVII (旷视)
Date: 2023.12

1. Background

Today's popular Large Vision-Language Models (LVLMs) generally