[Data Analysis] Loading ReGat's VQAFeatureDataset

2024-01-16 01:58


1. VQAFeatureDataset

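This class is the ReGat project's subclass of PyTorch's Dataset (from torch.utils.data import Dataset). It loads the training and test sets when the model runs, and the data it returns supplies the arguments of the model's forward function, shown below: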
regat.forward():

    def forward(self, v, b, q, implicit_pos_emb, sem_adj_matrix,
                spa_adj_matrix, labels):
        """Forward

        v: [batch, num_objs, obj_dim]
        b: [batch, num_objs, b_dim]
        q: [batch_size, seq_length]
        pos: [batch_size, num_objs, nongt_dim, emb_dim]
        sem_adj_matrix: [batch_size, num_objs, num_objs, num_edge_labels]
        spa_adj_matrix: [batch_size, num_objs, num_objs, num_edge_labels]

        return: logits, not probs
        """
        w_emb = self.w_emb(q)  # question word embedding
        q_emb_seq = self.q_emb.forward_all(w_emb)  # [batch, q_len, q_dim]
        q_emb_self_att = self.q_att(q_emb_seq)  # add self-attention information

        # [batch_size, num_rois, out_dim]
        if self.relation_type == "semantic":  # semantic relation
            v_emb = self.v_relation.forward(v, sem_adj_matrix, q_emb_self_att)
        elif self.relation_type == "spatial":  # spatial relation
            v_emb = self.v_relation.forward(v, spa_adj_matrix, q_emb_self_att)
        else:  # implicit relation
            v_emb = self.v_relation.forward(v, implicit_pos_emb,
                                            q_emb_self_att)

        if self.fusion == "ban":  # fusion model 1
            joint_emb, att = self.joint_embedding(v_emb, q_emb_seq, b)
        elif self.fusion == "butd":  # fusion model 2
            q_emb = self.q_emb(w_emb)  # [batch, q_dim]
            joint_emb, att = self.joint_embedding(v_emb, q_emb)
        else:  # mutan, fusion model 3
            joint_emb, att = self.joint_embedding(v_emb, q_emb_self_att)

        if self.classifier:  # classification head
            logits = self.classifier(joint_emb)
        else:
            logits = joint_emb
        return logits, att

VQAFeatureDataset

Variables stored on self

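| Variable | Meaning | Source | Example / value |
| --- | --- | --- | --- |
| self.ans2label | answer word to index mapping (dict) | trainval_ans2label.pkl | {'net': 0, 'pitcher': 1, 'orange': 2, 'yes': 3, 'white': 4, … |
| self.label2ans | index to answer word mapping (list) | trainval_label2ans.pkl | ['net', 'pitcher', 'orange', 'yes', 'white', … |
| self.num_ans_candidates | number of candidate answer words (int) | len(self.ans2label) | 3129 |
| self.img_id2idx | image id to index mapping (dict) | imgid2idx.pkl | {218224: 0, 306670: 1, 208663: 2, 225177: 3, 467257: 4, … |
| self.features | image features (Tensor) | hf.get('image_features') | tensor of shape [40504, 36, 2048] |
| self.normalized_bb | normalized spatial positions of the region bounding boxes (Tensor) | hf.get('spatial_features') | tensor of shape [40504, 36, 4] |
| self.bb | region bounding-box positions (Tensor) | hf.get('image_bb') | tensor of shape [40504, 36, 4] |
| self.semantic_adj_matrix | semantic adjacency matrix | hf.get('semantic_adj_matrix') if that key is in hf and relation_type is "semantic", otherwise None | |
| self.spatial_adj_matrix | spatial adjacency matrix | hf.get('image_adj_matrix') if that key is in hf and relation_type is "spatial", otherwise None | |
| self.pos_boxes | start/end row offsets per image (adaptive mode only) | None when adaptive=False, otherwise hf.get('pos_boxes') | None |
| self.entries | data entries (list of items) | _load_dataset(dataroot, name, self.img_id2idx, self.label2ans) | length 214354 |
| self.nongt_dim | | self.nongt_dim = nongt_dim | 36 |
| self.emb_dim | position embedding dimension | pos_emb_dim | 64 |
| self.v_dim | image feature dimension | self.features.size(1 if self.adaptive else 2) | 2048 |
| self.s_dim | spatial (bounding-box) feature dimension | self.normalized_bb.size(1 if self.adaptive else 2) | 6 |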
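
The two answer vocabularies are inverse views of the same mapping. The sketch below illustrates that relationship with a toy five-word vocabulary (the example words listed above) and a temporary file standing in for trainval_ans2label.pkl. The helper name build_vocab and the file name toy_ans2label.pkl are hypothetical and not part of ReGat.

```python
import os
import pickle
import tempfile


def build_vocab(answers):
    """Return (ans2label, label2ans) for a list of unique answer strings."""
    label2ans = list(answers)
    ans2label = {ans: idx for idx, ans in enumerate(label2ans)}
    return ans2label, label2ans


# Toy vocabulary; the real cache files hold 3129 answers.
ans2label, label2ans = build_vocab(['net', 'pitcher', 'orange', 'yes', 'white'])

# Round-trip through pickle, the same way the dataset reads its cache files.
with tempfile.TemporaryDirectory() as cache_dir:
    path = os.path.join(cache_dir, 'toy_ans2label.pkl')
    with open(path, 'wb') as f:
        pickle.dump(ans2label, f)
    with open(path, 'rb') as f:
        loaded = pickle.load(f)

num_ans_candidates = len(loaded)
print(num_ans_candidates)        # 5
print(label2ans[loaded['yes']])  # yes
```

Because label2ans is a list ordered by label, looking up an index found in ans2label always returns the original answer word.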

    import os
    import pickle

    import h5py
    import numpy as np
    import torch
    from torch.utils.data import Dataset

    import utils  # ReGat project module, provides assert_eq
    # _load_dataset is a helper from the ReGat project and is not shown here.


    class VQAFeatureDataset(Dataset):
        def __init__(self, name, dictionary, relation_type, dataroot='data',
                     adaptive=False, pos_emb_dim=64, nongt_dim=36):
            super(VQAFeatureDataset, self).__init__()
            assert name in ['train', 'val', 'test-dev2015', 'test2015']

            # Load the pkl files produced by preprocessing annotations.json
            ans2label_path = os.path.join(dataroot, 'cache',
                                          'trainval_ans2label.pkl')
            label2ans_path = os.path.join(dataroot, 'cache',
                                          'trainval_label2ans.pkl')
            # dict of the form {'net': 0, 'pitcher': 1, ...}, 3129 entries
            self.ans2label = pickle.load(open(ans2label_path, 'rb'))
            # list of the form ['net', 'pitcher', ...], 3129 entries
            self.label2ans = pickle.load(open(label2ans_path, 'rb'))
            # number of candidate answer words = 3129
            self.num_ans_candidates = len(self.ans2label)

            # Dictionary with 19901 words: idx2word (list), word2idx (dict),
            # padding_idx=19901, ntoken=19901
            self.dictionary = dictionary
            self.relation_type = relation_type
            # whether the features use an adaptive 10-100 regions per image
            self.adaptive = adaptive
            prefix = '36'
            if 'test' in name:
                prefix = '_36'

            # Directory that holds the hdf5 files
            h5_dataroot = dataroot+"/Bottom-up-features-adaptive"\
                if self.adaptive else dataroot+"/Bottom-up-features-fixed"
            imgid_dataroot = dataroot+"/imgids"  # directory of the image-id files

            # Load imgid2idx.pkl into self.img_id2idx: {image_id: index, ...}
            self.img_id2idx = pickle.load(
                open(os.path.join(imgid_dataroot, '%s%s_imgid2idx.pkl' %
                                  (name, '' if self.adaptive else prefix)), 'rb'))

            # Load the hdf5 file
            h5_path = os.path.join(h5_dataroot, '%s%s.hdf5' %
                                   (name, '' if self.adaptive else prefix))
            print('loading features from h5 file %s' % h5_path)
            with h5py.File(h5_path, 'r') as hf:
                # self.features = np.array(hf.get('image_features'))
                self.features = np.array(hf.get('image_features'),
                                         dtype='float32')
                self.normalized_bb = np.array(hf.get('spatial_features'),
                                              dtype='float32')
                self.bb = np.array(hf.get('image_bb'), dtype='float32')
                print("hdf5 data loaded successfully!")
                if "semantic_adj_matrix" in hf.keys() \
                        and self.relation_type == "semantic":
                    self.semantic_adj_matrix = np.array(
                        hf.get('semantic_adj_matrix'))
                    print("Loaded semantic adj matrix from file...",
                          self.semantic_adj_matrix.shape)
                else:
                    self.semantic_adj_matrix = None
                    print("Setting semantic adj matrix to None...")
                if "image_adj_matrix" in hf.keys()\
                        and self.relation_type == "spatial":
                    # load the spatial adjacency matrix from the file
                    self.spatial_adj_matrix = np.array(
                        hf.get('image_adj_matrix'))
                    print("Loaded spatial adj matrix from file...",
                          self.spatial_adj_matrix.shape)
                else:
                    self.spatial_adj_matrix = None
                    print("Setting spatial adj matrix to None...")

                self.pos_boxes = None
                if self.adaptive:
                    self.pos_boxes = np.array(hf.get('pos_boxes'),
                                              dtype='float32')
            self.entries = _load_dataset(dataroot, name, self.img_id2idx,
                                         self.label2ans)
            self.tokenize()
            print("Data loaded successfully!")
            self.tensorize()
            self.nongt_dim = nongt_dim
            self.emb_dim = pos_emb_dim
            self.v_dim = self.features.size(1 if self.adaptive else 2)
            self.s_dim = self.normalized_bb.size(1 if self.adaptive else 2)

        def tokenize(self, max_length=14):
            """Tokenizes the questions.

            This will add q_token in each entry of the dataset.
            -1 represent nil, and should be treated as padding_idx in embedding
            """
            for entry in self.entries:
                tokens = self.dictionary.tokenize(entry['question'], False)
                tokens = tokens[:max_length]
                if len(tokens) < max_length:
                    # Note here we pad to the back of the sentence
                    padding = [self.dictionary.padding_idx] * \
                              (max_length - len(tokens))
                    tokens = tokens + padding
                utils.assert_eq(len(tokens), max_length)
                entry['q_token'] = tokens

        def tensorize(self):
            self.features = torch.from_numpy(self.features)
            self.normalized_bb = torch.from_numpy(self.normalized_bb)
            self.bb = torch.from_numpy(self.bb)
            if self.semantic_adj_matrix is not None:
                self.semantic_adj_matrix = torch.from_numpy(
                    self.semantic_adj_matrix).double()
            if self.spatial_adj_matrix is not None:
                self.spatial_adj_matrix = torch.from_numpy(
                    self.spatial_adj_matrix).double()
            if self.pos_boxes is not None:
                self.pos_boxes = torch.from_numpy(self.pos_boxes)

            for entry in self.entries:
                question = torch.from_numpy(np.array(entry['q_token']))
                entry['q_token'] = question

                answer = entry['answer']
                if answer is not None:
                    labels = np.array(answer['labels'])
                    scores = np.array(answer['scores'], dtype=np.float32)
                    if len(labels):
                        labels = torch.from_numpy(labels)
                        scores = torch.from_numpy(scores)
                        entry['answer']['labels'] = labels
                        entry['answer']['scores'] = scores
                    else:
                        entry['answer']['labels'] = None
                        entry['answer']['scores'] = None

        def __getitem__(self, index):
            entry = self.entries[index]
            raw_question = entry["question"]
            image_id = entry["image_id"]

            question = entry['q_token']
            question_id = entry['question_id']
            if self.spatial_adj_matrix is not None:
                spatial_adj_matrix = self.spatial_adj_matrix[entry["image"]]
            else:
                spatial_adj_matrix = torch.zeros(1).double()
            if self.semantic_adj_matrix is not None:
                semantic_adj_matrix = self.semantic_adj_matrix[entry["image"]]
            else:
                semantic_adj_matrix = torch.zeros(1).double()
            if not self.adaptive:
                # fixed number of bounding boxes
                features = self.features[entry['image']]
                normalized_bb = self.normalized_bb[entry['image']]
                bb = self.bb[entry["image"]]
            else:
                features = self.features[
                    self.pos_boxes[entry['image']][0]:
                    self.pos_boxes[entry['image']][1], :]
                normalized_bb = self.normalized_bb[
                    self.pos_boxes[entry['image']][0]:
                    self.pos_boxes[entry['image']][1], :]
                bb = self.bb[
                    self.pos_boxes[entry['image']][0]:
                    self.pos_boxes[entry['image']][1], :]

            answer = entry['answer']
            if answer is not None:
                labels = answer['labels']
                scores = answer['scores']
                target = torch.zeros(self.num_ans_candidates)
                if labels is not None:
                    target.scatter_(0, labels, scores)
                return features, normalized_bb, question, target,\
                    question_id, image_id, bb, spatial_adj_matrix,\
                    semantic_adj_matrix

            else:
                # no answer available: question_id fills the target slot
                return features, normalized_bb, question, question_id,\
                    question_id, image_id, bb, spatial_adj_matrix,\
                    semantic_adj_matrix

        def __len__(self):
            return len(self.entries)
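
The truncate-then-pad step inside tokenize can be tried in isolation. This is a minimal sketch, assuming the question has already been converted to a list of token ids; the function name pad_tokens is hypothetical, and the default padding index 19901 is the value quoted in the code comments above.

```python
def pad_tokens(tokens, max_length=14, padding_idx=19901):
    """Truncate or right-pad a list of token ids to exactly max_length."""
    tokens = tokens[:max_length]
    if len(tokens) < max_length:
        # Pad at the back of the sentence, as tokenize() does.
        tokens = tokens + [padding_idx] * (max_length - len(tokens))
    assert len(tokens) == max_length
    return tokens


# A six-token question is right-padded with eight padding ids.
short = pad_tokens([0, 10, 68, 11, 2618, 225])
print(short)

# A question longer than max_length is cut, not padded.
long_question = pad_tokens(list(range(20)))
print(len(long_question))  # 14
```

Padding at the back keeps the real tokens at the start of every sequence, so every q_token has the same length and can be stacked into a batch.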

entries
entries holds the data items. It is a list of 214354 items, and each item is a dictionary.
Each item looks like this:

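| Key | Meaning | Example value |
| --- | --- | --- |
| question_id | question id | 42000 |
| image_id | image id | 42 |
| image | image index (row position in the feature tensors) | 37244 |
| question | question text | 'What color are the gym shoes?' |
| answer | answer: labels and scores | {'labels': tensor([ 4, 1594], dtype=torch.int32), 'scores': tensor([1.0000, 0.3000])} |
| q_token | question as a vector of word indices | tensor([ 0, 10, 68, 11, 2618, 225, 19901, 19901, 19901, 19901, 19901, 19901, 19901, 19901], dtype=torch.int32) |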
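
The difference between image_id and image is easy to miss: image_id is the dataset's identifier, while image is the row position looked up through img_id2idx. The sketch below uses plain Python lists in place of tensors, three image ids from the img_id2idx example, and a made-up question_id. The helper make_entry is hypothetical; the real entries are built by _load_dataset, which this article does not show.

```python
# Toy stand-ins: 3 images, 2 regions each, 4-dim features
# (the real tensor has shape [40504, 36, 2048]).
img_id2idx = {218224: 0, 306670: 1, 208663: 2}
features = [
    [[0.0] * 4, [0.1] * 4],  # row 0
    [[1.0] * 4, [1.1] * 4],  # row 1
    [[2.0] * 4, [2.1] * 4],  # row 2
]


def make_entry(question_id, image_id, question):
    """Hypothetical helper: attach the feature-row index to an entry."""
    return {
        'question_id': question_id,
        'image_id': image_id,
        'image': img_id2idx[image_id],  # row position, not the id itself
        'question': question,
    }


entry = make_entry(1, 306670, 'What color are the gym shoes?')
regions = features[entry['image']]
print(entry['image'])  # 1
print(len(regions))    # 2
```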

2. Data returned by __getitem__ for the model (fixed 36 regions per image)

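| Variable | Source |
| --- | --- |
| features | self.features[entry['image']] (here image is the row position, not the image id) |
| normalized_bb | self.normalized_bb[entry['image']] |
| question | entry['q_token'] |
| target | torch.zeros(self.num_ans_candidates) filled by scatter_(0, labels, scores) |
| question_id | entry['question_id'] |
| image_id | entry["image_id"] |
| bb | self.bb[entry["image"]] |
| spatial_adj_matrix | self.spatial_adj_matrix[entry["image"]], or torch.zeros(1).double() when no matrix was loaded |
| semantic_adj_matrix | self.semantic_adj_matrix[entry["image"]], or torch.zeros(1).double() when no matrix was loaded |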
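
To make the target row concrete, the following is a dependency-free sketch of what torch.zeros(num_ans_candidates).scatter_(0, labels, scores) produces when the labels are unique. It uses plain Python lists instead of tensors, and the function name build_target is hypothetical. The labels and scores are the ones from the example entry in section 1.

```python
def build_target(labels, scores, num_ans_candidates):
    """Plain-Python equivalent of scattering scores into a zero vector."""
    target = [0.0] * num_ans_candidates
    for label, score in zip(labels, scores):
        target[label] = score  # position = answer index, value = soft score
    return target


# Example entry: labels [4, 1594] with scores [1.0, 0.3].
target = build_target([4, 1594], [1.0, 0.3], num_ans_candidates=3129)
print(len(target))                   # 3129
print(target[4], target[1594])       # 1.0 0.3
print(sum(1 for t in target if t))   # 2
```

The result is a soft label vector over all 3129 candidate answers, with non-zero scores only at the annotated answers.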




