论文链接:Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering Bottom-Up Attention Model 本文的bottom up attention 模型在后面的image caption部分和VQA部分都会被用到。 这里用的是object detection领域
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering. 2018-CVPR P. Anderson, X. He, C. Buehler, D. Teney, M. Johnson, S. Gould, and L. Zhang. 什么是“自上而下”,“自下而上”? 类比人类视觉