Dedicated to Yingying
This article covers two ways of reading a dataset with the PyTorch framework.
Reference: https://www.cnblogs.com/denny402/p/7512516.html
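1. All data in a single folder
In this case every image sits in the same folder and the labels are stored in a JSON or txt file, so each image has to be read together with its label.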
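To make the label-file idea concrete, here is a minimal sketch of what such files might contain and how they can be turned into (path, label) pairs. The txt layout, one "<image path> <integer label>" pair per line, matches what the dataset class in this section parses. The JSON layout (an object mapping path to label), the file names and the helper names read_txt_labels and read_json_labels are assumptions made for illustration only.

```python
import json
import os
import tempfile


def read_txt_labels(txt_path):
    """Parse lines of the form '<image path> <integer label>'."""
    samples = []
    with open(txt_path, "r") as fh:
        for line in fh:
            words = line.split()
            if len(words) < 2:
                continue  # skip blank or malformed lines
            samples.append((words[0], int(words[1])))
    return samples


def read_json_labels(json_path):
    """Parse a JSON object of the assumed form {"<image path>": <integer label>}."""
    with open(json_path, "r") as fh:
        mapping = json.load(fh)
    # Sort so the sample order does not depend on the order of keys in the file
    return sorted((path, int(label)) for path, label in mapping.items())


# Write two tiny hypothetical label files and read them back.
with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "labels.txt")
    with open(txt_path, "w") as fh:
        fh.write("images/0.jpg 7\nimages/1.jpg 2\n\n")
    json_path = os.path.join(tmp, "labels.json")
    with open(json_path, "w") as fh:
        json.dump({"images/1.jpg": 2, "images/0.jpg": 7}, fh)

    txt_samples = read_txt_labels(txt_path)
    json_samples = read_json_labels(json_path)

print(txt_samples)
print(json_samples)
```

Note that splitting on whitespace breaks if an image path contains spaces; in that case a different separator (or the JSON form) is probably the safer choice.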
from PIL import Image
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms


def default_loader(path):
    # Open the image and force three RGB channels
    return Image.open(path).convert('RGB')


class MyDataset(Dataset):
    def __init__(self, txt, transform=None, loader=default_loader):
        # Each line of the txt file: <image path> <integer label>
        imgs = []
        with open(txt, 'r') as fh:
            for line in fh:
                words = line.split()
                if not words:
                    continue  # skip blank lines
                imgs.append((words[0], int(words[1])))
        self.imgs = imgs
        self.transform = transform
        self.loader = loader

    def __getitem__(self, index):
        # Load one image and return it together with its label
        fn, label = self.imgs[index]
        img = self.loader(fn)
        if self.transform is not None:
            img = self.transform(img)
        return img, label

    def __len__(self):
        # Number of samples listed in the txt file
        return len(self.imgs)


train_data = MyDataset(txt='mnist_test.txt', transform=transforms.ToTensor())
data_loader = DataLoader(train_data, batch_size=100, shuffle=True)
Note: __getitem__ and __len__ are Python magic methods. They are what make dataset[i] and len(dataset) work.
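For readers who have not met these two methods before, the toy class below (unrelated to PyTorch; the name Squares is made up for this example) shows what Python does with them: len(obj) calls __len__ and obj[i] calls __getitem__. As far as I understand it, a DataLoader relies on the same two calls to find out how many samples a dataset has and to fetch them one by one.

```python
class Squares:
    """A toy container that only implements __len__ and __getitem__."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        # Called by len(obj)
        return self.n

    def __getitem__(self, index):
        # Called by obj[index]; IndexError tells Python where the data ends
        if not 0 <= index < self.n:
            raise IndexError(index)
        return index * index


squares = Squares(4)
print(len(squares))   # 4
print(squares[3])     # 9
print(list(squares))  # [0, 1, 4, 9]
```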
2. Data in separate folders
That is, the images of each class are kept in their own subfolder. This case is simpler:
just call ImageFolder from torchvision.
import torch
import torchvision
from torchvision import transforms

img_data = torchvision.datasets.ImageFolder(
    'D:/bnu/database/flower',
    transform=transforms.Compose([
        # transforms.Scale was deprecated in favour of Resize;
        # Resize takes the target size as a (height, width) tuple
        transforms.Resize((256, 256)),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
    ]),
)
data_loader = torch.utils.data.DataLoader(img_data, batch_size=20, shuffle=True)
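It can help to know where the labels come from in this setup. As far as I know, ImageFolder expects a layout of root/class_name/image files and numbers the classes by their sorted folder names, exposing the result as its class_to_idx attribute. The pure-Python sketch below only imitates that mapping so it can run without torchvision; the helper name find_classes and the flower folder names are assumptions for illustration.

```python
import os
import tempfile


def find_classes(root):
    """Map each subfolder name under root to an integer index, in sorted order."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return {name: idx for idx, name in enumerate(classes)}


# Build a hypothetical dataset root with one empty subfolder per class.
with tempfile.TemporaryDirectory() as root:
    for name in ("rose", "daisy", "tulip"):
        os.makedirs(os.path.join(root, name))
    class_to_idx = find_classes(root)

print(class_to_idx)  # {'daisy': 0, 'rose': 1, 'tulip': 2}
```

With the real dataset, printing img_data.class_to_idx should show the corresponding mapping for the folders on disk.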
3. Using the os.listdir function
os.listdir returns the names of all files and folders under path.
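A short sketch of the call in practice, using a temporary folder with made-up file names. Two details are worth noticing: the function returns bare names rather than full paths, and the order is arbitrary, so sorting and joining with the folder path is usually needed before the names can be used to load images.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Create one subfolder and two empty files (hypothetical names)
    os.makedirs(os.path.join(root, "subdir"))
    for name in ("b.jpg", "a.jpg"):
        open(os.path.join(root, name), "w").close()

    # os.listdir gives names only, in arbitrary order, so sort them
    names = sorted(os.listdir(root))
    # Keep regular files and drop subfolders
    files = [n for n in names if os.path.isfile(os.path.join(root, n))]
    # Join with the folder to get paths that can actually be opened
    full_paths = [os.path.join(root, n) for n in files]

print(names)  # ['a.jpg', 'b.jpg', 'subdir']
print(files)  # ['a.jpg', 'b.jpg']
```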