This article introduces a HuggingFace acceleration trick aimed at one specific problem: the numpy matrix built from the predictions grows so large that the program crashes. Hopefully it is a useful reference for developers who run into the same issue.
import numpy as np
from torch import nn
from typing import List, Tuple

def align_predictions(predictions: np.ndarray, label_ids: np.ndarray) -> Tuple[List[int], List[int]]:
    print("predictions:", predictions)
    print("label_ids:", label_ids)
    # preds = np.argmax(predictions, axis=2)  # original line, now commented out
    preds = predictions  # changed: the argmax has already been applied upstream, no need to recompute it
    print("preds:", preds)
    batch_size, seq_len = preds.shape
    print("batch_size:", batch_size)
    print("seq_len:", seq_len)
    out_label_list = [[] for _ in range(batch_size)]
    preds_list = [[] for _ in range(batch_size)]
    for i in range(batch_size):
        for j in range(seq_len):
            if label_ids[i, j] != nn.CrossEntropyLoss().ignore_index:
                # label_map (id -> label string) is assumed to be defined elsewhere in the script
                out_label_list[i].append(label_map[label_ids[i][j]])
                preds_list[i].append(label_map[preds[i][j]])
    return preds_list, out_label_list
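For completeness, align_predictions is normally called from the compute_metrics function handed to the Trainer. The wrapper below is not part of the original snippet; it is a minimal sketch assuming the usual token-classification setup, with seqeval used for the metrics:

from typing import Dict
from seqeval.metrics import f1_score, precision_score, recall_score
from transformers import EvalPrediction

# Hypothetical compute_metrics wrapper (not from the original article).
def compute_metrics(p: EvalPrediction) -> Dict[str, float]:
    # p.predictions already holds per-token label ids, because
    # preprocess_logits_for_metrics (shown below) applied argmax per batch
    preds_list, out_label_list = align_predictions(p.predictions, p.label_ids)
    return {
        "precision": precision_score(out_label_list, preds_list),
        "recall": recall_score(out_label_list, preds_list),
        "f1": f1_score(out_label_list, preds_list),
    }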
from transformers import Trainer

# Added function: do the argmax up front so the accumulated numpy matrix never gets too large
def preprocess_logits_for_metrics(logits, labels):
    if isinstance(logits, tuple):
        # Depending on the model and config, logits may contain extra tensors,
        # like past_key_values, but logits always come first
        logits = logits[0]
    return logits.argmax(dim=-1)

# Initialize our Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
    preprocess_logits_for_metrics=preprocess_logits_for_metrics,
)
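Why this helps: without the hook, the Trainer gathers the full float logits of every evaluation batch and concatenates them into one numpy array of shape (num_examples, seq_len, num_labels); with the hook, only the argmax'ed integer ids of shape (num_examples, seq_len) are kept. A small illustrative sketch of the difference (every size below is a made-up example number, not taken from the original article):

import torch

# Illustrative per-batch shapes only
batch_size, seq_len, num_labels = 8, 128, 9

logits = torch.randn(batch_size, seq_len, num_labels)        # what the model returns
labels = torch.zeros(batch_size, seq_len, dtype=torch.long)

ids = preprocess_logits_for_metrics(logits, labels)
print(logits.shape, "->", ids.shape)   # (8, 128, 9) -> (8, 128)

# Rough footprint once all evaluation batches are concatenated,
# assuming e.g. 100,000 examples (again, an illustrative number):
num_examples = 100_000
full_logits_gib = num_examples * seq_len * num_labels * 4 / 1024**3   # float32 logits
argmax_ids_gib = num_examples * seq_len * 8 / 1024**3                 # int64 ids
print(f"full logits ~{full_logits_gib:.1f} GiB vs argmax ids ~{argmax_ids_gib:.1f} GiB")

Because the reduction runs before the Trainer moves the logits off the accelerator and stores them, only the much smaller id array has to be kept between batches.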
Of course, if you can also adjust the virtual memory settings and carve out a slice of hard-disk space as virtual memory (about twice your physical RAM), the code will run even faster!
Reference for how to set that up: https://zhuanlan.zhihu.com/p/37332255
That wraps up this article on the HuggingFace acceleration trick for keeping an oversized numpy matrix from crashing the program. We hope it proves helpful to fellow developers!