This article shows how to save a Hugging Face dataset from Colab to Google Drive and load it back later. Start by mounting Drive in the Colab runtime:
from google.colab import drive
drive.mount('/content/drive')
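Before writing anything to Drive, it can help to confirm that the mount actually succeeded. The following is a minimal sketch and not part of the original workflow: `find_drive_root` is a hypothetical helper, and the candidate folder names `MyDrive` and `My Drive` are assumptions about how the Drive root appears under the mount point, which may differ between Colab versions.

```python
import os

def find_drive_root(mount_point="/content/drive", candidates=("MyDrive", "My Drive")):
    """Return the first existing candidate folder under mount_point, or None.

    Both this helper and the candidate names are illustrative assumptions.
    """
    for name in candidates:
        path = os.path.join(mount_point, name)
        if os.path.isdir(path):
            return path
    return None

drive_root = find_drive_root()
if drive_root is None:
    print("Drive does not appear to be mounted; run drive.mount('/content/drive') first.")
else:
    print("Drive root:", drive_root)
```

If the helper returns `None`, the save step would write to the local Colab disk rather than to Drive, and the data would be lost when the runtime is recycled.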
1. Download the dataset
from datasets import load_dataset
max_length = 32 # Maximum length of the captions in tokens
coco_dataset_ratio = 50  # Use 50% of the COCO2014 dataset

# Load the COCO2014 dataset for training, validation, and testing splits
train_ds = load_dataset("HuggingFaceM4/COCO", split=f"train[:{coco_dataset_ratio}%]")
valid_ds = load_dataset("HuggingFaceM4/COCO", split=f"validation[:{coco_dataset_ratio}%]")
test_ds = load_dataset("HuggingFaceM4/COCO", split="test")
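The `split` argument uses the slicing syntax of `datasets`, where a string such as `train[:50%]` selects the first 50% of the training split. A small sketch of building that string with a range check follows; `split_spec` is a hypothetical helper introduced only for illustration, and it assumes the percentage is given as an integer.

```python
def split_spec(split, percent=100):
    """Build a split string such as 'train[:50%]' for load_dataset.

    Hypothetical helper; percent is assumed to be an integer in 1..100.
    """
    if not 0 < percent <= 100:
        raise ValueError(f"percent must be in (0, 100], got {percent}")
    if percent == 100:
        return split  # the whole split, no slice needed
    return f"{split}[:{percent}%]"

print(split_spec("train", 50))
print(split_spec("test"))
```

The resulting string can be passed as `split=split_spec("train", coco_dataset_ratio)` in place of the inline f-string.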
2. Save the dataset
dataset_path = '/content/drive/My Drive/COCO_Dataset_all'
train_ds.save_to_disk(dataset_path + '/train')
valid_ds.save_to_disk(dataset_path + '/validation')
test_ds.save_to_disk(dataset_path + '/test')
3. Load the dataset back
from datasets import load_from_disk

# Must point to the same directory that was used when saving
dataset_path = '/content/drive/My Drive/COCO_Dataset_all'
train_ds = load_from_disk(dataset_path + '/train')
valid_ds = load_from_disk(dataset_path + '/validation')
test_ds = load_from_disk(dataset_path + '/test')
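Because `save_to_disk` writes each split into its own directory, a mismatch between the save path and the load path shows up as a missing directory. The sketch below checks for the three split directories before loading. `missing_splits` is a hypothetical helper, and the path assumes the same directory that was used in the save step.

```python
import os

def missing_splits(dataset_path, splits=("train", "validation", "test")):
    """Return the names of splits whose directory is absent under dataset_path."""
    return [name for name in splits
            if not os.path.isdir(os.path.join(dataset_path, name))]

# Assumed to be the same directory used in the save step
dataset_path = '/content/drive/My Drive/COCO_Dataset_all'
missing = missing_splits(dataset_path)
if missing:
    print("Missing split directories:", missing)
```

This only confirms that the directories exist; it does not verify that their contents were written completely.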