This post records the errors I hit while running the VQA-ReGat project and how I worked around each one, as a reference for anyone who runs into the same problems.
VQA-ReGat: Relation-Aware Graph Attention Network for VQA
Project repository
Paper
-
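1. torch error:
StopIteration: Caught StopIteration in replica 0 on device 0.
Cause: the error appears when the project runs on multiple GPUs, and is probably a torch version incompatibility.
Fix (first attempt, which leads to the error in item 2): following another blog post, change weight = next(self.parameters()).data
to weight = torch.float32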
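The failure mode itself is easy to reproduce in isolation. The sketch below is only an illustration, not code from the project: it uses nn.Identity as a stand-in for a module whose parameters() iterator is empty. As far as I know, newer PyTorch releases no longer expose parameters through parameters() on the replicas that DataParallel creates, which would explain why next(self.parameters()) fails only in multi-GPU runs, but treat that explanation as an assumption.

```python
import torch
from torch import nn

# Stand-in for a module with an empty parameter iterator (assumption:
# DataParallel replicas behave like this on newer torch versions).
empty_module = nn.Identity()

try:
    next(empty_module.parameters())
    raised = False
except StopIteration:
    # Same exception type as in the traceback above.
    raised = True

# Passing a default to next() avoids the exception, but the caller
# still has to handle the "no parameter available" case.
first_param = next(empty_module.parameters(), None)

print(raised, first_param)
```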
-
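2. A second error after that change:
AttributeError: 'torch.dtype' no attribute 'new'
That is, a torch.dtype object has no attribute called new.
Cause: after the change in item 1, weight is a torch.dtype object, not a torch.Tensor.
Fix: reading the source shows that next(self.parameters()).data is only used as a template tensor that carries the right data type,
and in practice that is almost always a torch.float32 tensor on CUDA.
So the final change is: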
weight = 0
weight = torch.tensor(weight,dtype=torch.float32)
weight = weight.cuda()
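An alternative that avoids hard-coding CUDA is to build the zero hidden state from the input batch itself, so it lands on whatever device the replica's input is on. This is a minimal sketch under assumptions: init_hidden_like is a hypothetical helper, the argument names nlayers, ndirections and num_hid are illustrative, and I assume the hidden state is created somewhere that can see the input tensor. It is not the project's own code.

```python
import torch

def init_hidden_like(x, nlayers, ndirections, num_hid):
    """Hypothetical helper: zero RNN hidden state on the same device as x.

    x is assumed to be a batch-first input tensor, shape (batch, seq, dim).
    """
    batch = x.size(0)
    hid_shape = (nlayers * ndirections, batch, num_hid)
    # new_zeros inherits the device from x; dtype is set explicitly so
    # the result is float32 even if x holds integer token ids.
    return x.new_zeros(hid_shape, dtype=torch.float32)

# Usage on CPU; on a GPU replica, x would already live on that GPU.
x = torch.randn(5, 14, 300)
hidden = init_hidden_like(x, nlayers=1, ndirections=2, num_hid=16)
print(hidden.shape, hidden.dtype, hidden.device)
```

Because the tensor is derived from x rather than from the module's parameters, the same code runs on CPU, on one GPU, or inside each multi-GPU replica.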
-
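3. Error:
RuntimeError: unsupported operation: more than one element of the written-to tensor refers to a single memory location. Please clone() the tensor before performing the operation
Cause: expand() returns a view in which the repeated elements share the same memory, so an in-place write to it is rejected; clone() gives the tensor its own storage.
Fix: change q_expand = q.expand(*repeat_vals)
to q_expand = q.expand(*repeat_vals).clone()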
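The behaviour behind this error can be seen on a tiny tensor. The sketch below is a standalone illustration with made-up shapes, not the project's code: the expanded view has stride 0 along the repeated dimension, an in-place update on it raises RuntimeError, and the cloned copy accepts the same update.

```python
import torch

q = torch.arange(3, dtype=torch.float32).unsqueeze(0)  # shape (1, 3)

# expand() does not copy data: all 4 rows alias the single row of q,
# which shows up as stride 0 along the expanded dimension.
view = q.expand(4, 3)
row_stride = view.stride()[0]

try:
    view.add_(1.0)  # in-place write to overlapping memory
    inplace_on_view_failed = False
except RuntimeError:
    inplace_on_view_failed = True

# clone() materialises the 4 rows in their own storage, so in-place
# operations are allowed on the copy.
q_expand = q.expand(4, 3).clone()
q_expand.add_(1.0)

print(row_stride, inplace_on_view_failed, tuple(q_expand.shape))
```

The cost of clone() is the extra memory for the materialised copy, which is worth keeping in mind given the out-of-memory error in the next item.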
-
4. Error:
RuntimeError: CUDA out of memory. Tried to allocate 292.00 MiB (GPU 0; 10.76 GiB total capacity; 4.34 GiB already allocat
Fix: the model probably has a large number of parameters, so set a smaller batch_size.
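If a smaller batch_size changes training behaviour too much, one common workaround is gradient accumulation: run several small batches and step the optimizer once. This is a swapped-in technique that I did not use for the results below, shown here only as a minimal CPU sketch with a toy nn.Linear model and random data. All names and sizes in it are illustrative assumptions.

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(8, 2)                      # toy stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

micro_batch_size = 4   # what fits in memory
accum_steps = 4        # 4 x 4 = effective batch of 16

inputs = torch.randn(16, 8)
targets = torch.randint(0, 2, (16,))

num_updates = 0
optimizer.zero_grad()
for step, start in enumerate(range(0, len(inputs), micro_batch_size), start=1):
    x = inputs[start:start + micro_batch_size]
    y = targets[start:start + micro_batch_size]
    # Scale so the summed gradients match the mean over the full batch.
    loss = loss_fn(model(x), y) / accum_steps
    loss.backward()                          # gradients accumulate in .grad
    if step % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
        num_updates += 1

print(num_updates)
```

Only one micro-batch of activations is alive at a time, so peak memory follows micro_batch_size while the update still reflects all 16 samples.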
VQA-ReGat results
Only three epochs of results were recorded:
--------------mutan.json--------------------------------
epoch 10:
train_loss: 2.69, norm: 2.9384, score: 71.8663
eval score: 76.90 (92.66)
-------------butd.json----------------------------------
epoch 28, time: 758.79
train_loss: 2.53, norm: 2.6914, score: 73.75
eval score: 75.40 (92.66)
epoch 29, time: 765.29
train_loss: 2.53, norm: 2.6903, score: 73.72
eval score: 75.40 (92.66)