Errors encountered when running the VQA-ReGat project

VQA-ReGat: Relation-aware Graph Attention Network for VQA

Project repository
Paper

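1. torch error: StopIteration: Caught StopIteration in replica 0 on device 0.
Cause: the error appears when the project runs on multiple GPUs and is probably related to the torch version. In PyTorch 1.5 and later, the replicas created by nn.DataParallel no longer expose their parameters through parameters(), so next(self.parameters()) raises StopIteration.
Fix: following another blog post, change weight = next(self.parameters()).data to weight = torch.float32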
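2. Still fails: AttributeError: 'torch.dtype' object has no attribute 'new', i.e. torch.dtype has no new attribute.
Cause: after the change in step 1, weight is a torch.dtype object rather than a torch.Tensor.
Fix: reading the source shows that next(self.parameters()).data is only used to obtain a tensor of the right type, which in most cases is a CUDA tensor of dtype torch.float32.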

The final fix is therefore:

# Placeholder scalar tensor: only its type and device matter
weight = 0
weight = torch.tensor(weight, dtype=torch.float32)
weight = weight.cuda()
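
The hard-coded .cuda() call above can be avoided. As a minimal sketch, assuming the hidden state only needs to match the device and dtype of the RNN input, it can be created from the input batch itself. The class TinyQuestionEncoder and its dimensions are hypothetical stand-ins for illustration, not the project's code.

```python
import torch
import torch.nn as nn


class TinyQuestionEncoder(nn.Module):
    """Hypothetical stand-in for an RNN question encoder."""

    def __init__(self, in_dim=8, num_hid=16, nlayers=1):
        super().__init__()
        self.num_hid = num_hid
        self.nlayers = nlayers
        self.rnn = nn.GRU(in_dim, num_hid, nlayers, batch_first=True)

    def init_hidden(self, x):
        # Assumption: x is a float tensor of embeddings. new_zeros() copies
        # its dtype and device, so no call to next(self.parameters()) or
        # .cuda() is needed inside an nn.DataParallel replica.
        return x.new_zeros((self.nlayers, x.size(0), self.num_hid))

    def forward(self, x):
        hidden = self.init_hidden(x)
        output, _ = self.rnn(x, hidden)
        return output[:, -1]  # features of the last time step


encoder = TinyQuestionEncoder()
batch = torch.randn(4, 5, 8)  # (batch, seq_len, in_dim), on CPU here
features = encoder(batch)
print(features.shape)
```

The same code runs unchanged on CPU, on a single GPU, or under nn.DataParallel, because the hidden state always follows the input batch.
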

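3. Error: RuntimeError: unsupported operation: more than one element of the written-to tensor refers to a single memory location. Please clone() the tensor before performing the operation
Cause: expand() returns a view whose repeated elements share the same memory, so a later in-place write to q_expand is rejected.
Fix: change q_expand = q.expand(*repeat_vals) to q_expand = q.expand(*repeat_vals).clone()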
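
A small standalone sketch of why clone() is needed here, using toy shapes rather than the project's tensors: expand() gives the broadcast dimension a stride of 0, so an in-place write would hit the same memory location several times, while clone() allocates a real buffer that can be written safely.

```python
import torch

q = torch.zeros(3, 1)          # one column, to be broadcast to four
view = q.expand(3, 4)          # no copy is made; dim 1 gets stride 0
print(view.stride())

try:
    view.add_(1.0)             # in-place write on overlapping memory
    failed = False
except RuntimeError as err:
    failed = True
    print("in-place write rejected:", err)

safe = q.expand(3, 4).clone()  # clone() materializes a full (3, 4) tensor
safe[:, 0] = 1.0               # in-place writes are now fine
print(safe)
```

The cost of the fix is the extra memory for the materialized copy, which matters when the expanded tensor is large.
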
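4. Error: RuntimeError: CUDA out of memory. Tried to allocate 292.00 MiB (GPU 0; 10.76 GiB total capacity; 4.34 GiB already allocat
Fix: the project probably has a large number of parameters, and the memory needed for activations also grows with the batch size, so set a smaller batch_size.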
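
If a smaller batch_size changes training behavior too much, gradient accumulation is one common workaround. This is a different technique from the fix above and is not something the project is known to provide; the sketch below uses a hypothetical toy model and random data, and assumes a loss that is averaged over the batch. It shows that four micro-batches of 16 yield the same gradient as one batch of 64.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 2)       # hypothetical stand-in for the VQA model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()  # mean over the batch by default

inputs = torch.randn(64, 10)
targets = torch.randint(0, 2, (64,))

# Reference: gradient of one full batch of 64 samples.
loss_fn(model(inputs), targets).backward()
full_grad = model.weight.grad.clone()
optimizer.zero_grad()

micro_batch = 16               # size that fits in GPU memory
accum_steps = 4                # 16 * 4 = effective batch size of 64

for step in range(accum_steps):
    rows = slice(step * micro_batch, (step + 1) * micro_batch)
    loss = loss_fn(model(inputs[rows]), targets[rows])
    # Scale so the accumulated gradients average over all 64 samples.
    (loss / accum_steps).backward()

accum_grad = model.weight.grad.clone()
optimizer.step()               # one update for the whole effective batch
optimizer.zero_grad()

print(torch.allclose(full_grad, accum_grad, atol=1e-6))
```

Only one micro-batch of activations is held in memory at a time, at the price of more forward and backward passes per update.
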

VQA-ReGat results

Only three epochs of results were recorded:

--------------mutan.json--------------------------------
epoch 10:
train_loss: 2.69, norm: 2.9384, score: 71.8663
eval score: 76.90 (92.66)
-------------butd.json----------------------------------
epoch 28, time: 758.79
train_loss: 2.53, norm: 2.6914, score: 73.75
eval score: 75.40 (92.66)
epoch 29, time: 765.29
train_loss: 2.53, norm: 2.6903, score: 73.72
eval score: 75.40 (92.66)
