论文标题:TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones 论文作者:Zhengqing Yuan, Zhaoxu Li, Lichao Sun 作者单位:Anhui Polytechnic University, Nanyang Technological University, Lehigh Un
论文标题:TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones 论文作者:Zhengqing Yuan, Zhaoxu Li, Lichao Sun 作者单位:Anhui Polytechnic University, Nanyang Technological University, Lehigh Un
论文标题:Improved Baselines with Visual Instruction Tuning 论文作者:Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee 作者单位:University of Wisconsin-Madison, Microsoft Research, Columbia University 论文原文:https:
论文标题:LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing 论文作者:Wei-Ge Chen, Irina Spiridonova, Jianwei Yang, Jianfeng Gao, Chunyuan Li 作者单位:Microsoft Research, R
论文标题:Video-LLaVA: Learning United Visual Representation by Alignment Before Projection 论文作者:Bin Lin, Yang Ye, Bin Zhu, Jiaxi Cui, Munan Ning, Peng Jin, Li Yuan 作者单位:Peking University, Peng Cheng Labo
论文标题:LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents 论文作者:Shilong Liu, Hao Cheng, Haotian Liu, Hao Zhang, Feng Li, Tianhe Ren, Xueyan Zou, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang,
论文标题:Visual Instruction Tuning 论文作者:Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee 作者单位:University of Wisconsin-Madison, Microsoft Research, Columbia University 论文原文:https://arxiv.org/abs/2304.0
论文标题:LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day 论文作者:Chunyuan Li∗, Cliff Wong∗, Sheng Zhang∗, Naoto Usuyama, Haotian Liu, Jianwei Yang Tristan Naumann, Hoifu