Last Updated on 2021-05-12 by Clay
Today we take on a classifier for a different data set again. This time it is CIFAR-10, a more difficult problem than MNIST handwriting recognition. In addition to the images being 32x32, CIFAR-10 images are no longer pure grayscale but color pictures with three RGB channels.
As the task gets harder, a model built purely from fully connected layers is no longer enough. This time I practiced using classic techniques such as convolution layers and max pooling (CNN).
A lot of my code is referenced from the official PyTorch Tutorial: https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#sphx-glr-beginner-blitz-cifar10-tutorial-py
So let's start.
Code explanation
# -*- coding: utf-8 -*-
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
First, you need to import the packages you want to use.
# GPU
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
print('GPU state:', device)
Check whether a GPU is available. If you do not have one, the code falls back to the CPU, which works but is slower.
# Cifar-10 data
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])
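Use the torchvision transforms module to convert the image data. ToTensor() turns each image into a tensor with values in [0, 1], and Normalize() with a mean and standard deviation of 0.5 per channel rescales those values to [-1, 1]. It is a useful module, and I have been writing notes on its various functions recently.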
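As a rough illustration of what the Normalize step does, the sketch below applies the same per-channel formula, (value - mean) / std, by hand. The tiny 3x2x2 tensor is a made-up stand-in for an image (an assumption for illustration, not CIFAR-10 data).

```python
import torch

# Made-up 3-channel, 2x2 "image" with values in [0, 1], like ToTensor() output.
image = torch.tensor([[[0.0, 0.25], [0.5, 1.0]]] * 3)

# Same mean and std as in the transform above, reshaped to broadcast per channel.
mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)

# Per-channel normalization: values move from [0, 1] to [-1, 1].
normalized = (image - mean) / std
print(normalized[0])
```

With a mean and std of 0.5, a pixel value of 0 maps to -1 and a value of 1 maps to 1, so the inputs end up centered around zero.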
# Data
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
trainLoader = torch.utils.data.DataLoader(trainset, batch_size=8, shuffle=True, num_workers=2)
testLoader = torch.utils.data.DataLoader(testset, batch_size=8, shuffle=False, num_workers=2)
Since torchvision's datasets module includes CIFAR-10, the data can be downloaded here without any manual setup.
If no data folder exists in the current directory, one is created automatically and the CIFAR-10 data is placed in it.
You can adjust batch_size yourself, but in my experiments a batch size of 8 gave the highest accuracy.
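To get a feel for what batch_size=8 means in practice, here is a small sketch of the arithmetic. It assumes the standard CIFAR-10 split of 50,000 training and 10,000 test images and a loader that keeps the last partial batch (the DataLoader default); the helper name batches_per_epoch is hypothetical.

```python
import math

TRAIN_IMAGES = 50000  # standard CIFAR-10 training split
TEST_IMAGES = 10000   # standard CIFAR-10 test split
BATCH_SIZE = 8


def batches_per_epoch(num_samples, batch_size):
    """Number of batches when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)


train_batches = batches_per_epoch(TRAIN_IMAGES, BATCH_SIZE)
test_batches = batches_per_epoch(TEST_IMAGES, BATCH_SIZE)
print(train_batches, test_batches)
```

A smaller batch size means more weight updates per epoch, which is one reason it can change both the training time and the final accuracy.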
# Data classes
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
These are the 10 categories in CIFAR-10.
# Model structure
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16*5*5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net().to(device)
print(net)
Output:
Net(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
This is the structure of the model, and it is the part I hesitate to change. I have tried a few variations, but the results either got worse or did not improve at all. I need more time to experiment.
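One detail of the structure that may look arbitrary is the 16*5*5 input size of fc1. The sketch below traces the spatial size of a 32x32 image through the two convolution and pooling stages, using the usual output-size formula for unpadded layers. The helper names conv_out and pool_out are hypothetical, written only for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution layer."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Output width/height of a max-pooling layer without padding."""
    return (size - kernel) // stride + 1


size = 32                    # CIFAR-10 images are 32x32
size = conv_out(size, 5)     # conv1, 5x5 kernel: 32 -> 28
size = pool_out(size, 2, 2)  # pool, 2x2 stride 2: 28 -> 14
size = conv_out(size, 5)     # conv2, 5x5 kernel: 14 -> 10
size = pool_out(size, 2, 2)  # pool, 2x2 stride 2: 10 -> 5

# conv2 produces 16 channels, so the flattened vector has 16 * 5 * 5 entries.
flattened = 16 * size * size
print(size, flattened)
```

This is why changing a kernel size or the number of channels also means updating the in_features of fc1 and the view() call in forward().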
# Parameters
criterion = nn.CrossEntropyLoss()
lr = 0.001
epochs = 3
optimizer = optim.SGD(net.parameters(), lr=lr, momentum=0.9)
These are the parameter settings: the loss function (cross entropy, the usual choice for multi-class classification), the learning rate, the number of passes over the training data (epochs), and the optimizer.
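To make the loss function a little less of a black box, the following sketch compares nn.CrossEntropyLoss with a manual computation on one made-up sample. The logits and label are illustrative values, not outputs of the model above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# One made-up sample with raw scores (logits) for 3 classes; the true class is 0.
logits = torch.tensor([[2.0, 0.5, -1.0]])
label = torch.tensor([0])

# Built-in loss: expects raw logits, not probabilities.
loss_builtin = nn.CrossEntropyLoss()(logits, label)

# Manual version: negative log of the softmax probability of the true class.
log_probs = F.log_softmax(logits, dim=1)
loss_manual = -log_probs[0, label.item()]

print(loss_builtin.item(), loss_manual.item())
```

Because the softmax is applied inside the loss, the model's forward() returns the raw output of fc3 without an extra softmax layer.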
# Train
for epoch in range(epochs):
    running_loss = 0.0

    for times, data in enumerate(trainLoader, 0):
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)

        # Zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics: average loss over the batches seen so far in this epoch
        running_loss += loss.item()
        if times % 100 == 99 or times+1 == len(trainLoader):
            print('[%d/%d, %d/%d] loss: %.3f' % (epoch+1, epochs, times+1, len(trainLoader), running_loss/(times+1)))

print('Finished Training')
Here is the training process. Note that optimizer.zero_grad() must be called to clear the gradients before each weight update; otherwise the gradients keep accumulating across batches.
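The accumulation is easy to see on a single parameter. This is a minimal sketch with a made-up scalar weight instead of the real network; it clears the gradient directly with w.grad.zero_() rather than through an optimizer.

```python
import torch

# A single made-up weight; d(3 * w)/dw is always 3.
w = torch.tensor(2.0, requires_grad=True)

(w * 3).backward()
first = w.grad.item()        # gradient after one backward pass

(w * 3).backward()
accumulated = w.grad.item()  # second pass is added to the stored gradient

w.grad.zero_()               # clear the stored gradient
(w * 3).backward()
after_reset = w.grad.item()  # back to the gradient of a single pass

print(first, accumulated, after_reset)
```

Without the reset, each optimizer.step() would use the sum of the gradients from all previous batches instead of the gradient of the current batch.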
# Test
correct = 0
total = 0

with torch.no_grad():
    for data in testLoader:
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)

        outputs = net(inputs)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test inputs: %d %%' % (100 * correct / total))

class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))

with torch.no_grad():
    for data in testLoader:
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)

        outputs = net(inputs)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels)

        # Use the actual batch length so a smaller final batch does not break the loop
        for i in range(labels.size(0)):
            label = labels[i].item()
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d %%' % (classes[i], 100 * class_correct[i] / class_total[i]))
Output:
Accuracy of the network on the 10000 test inputs: 55 %
Accuracy of plane : 57 %
Accuracy of car : 72 %
Accuracy of bird : 31 %
Accuracy of cat : 16 %
Accuracy of deer : 53 %
Accuracy of dog : 68 %
Accuracy of frog : 59 %
Accuracy of horse : 65 %
Accuracy of ship : 56 %
Accuracy of truck : 71 %
Here is the test part. We evaluate on data the model never saw during training. Random guessing over 10 classes would score about 10%, so an accuracy of 55% shows the model is clearly doing better than chance.
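As a quick way to read the per-class numbers, the sketch below takes the percentages printed in the output above and compares them with the chance level for 10 classes. It only summarizes those printed values and does not rerun the model.

```python
# Per-class accuracies (in percent) copied from the output above.
per_class = {
    'plane': 57, 'car': 72, 'bird': 31, 'cat': 16, 'deer': 53,
    'dog': 68, 'frog': 59, 'horse': 65, 'ship': 56, 'truck': 71,
}

# Chance level when guessing uniformly among the classes.
chance = 100 / len(per_class)

# Unweighted mean of the per-class accuracies.
mean_accuracy = sum(per_class.values()) / len(per_class)

hardest = min(per_class, key=per_class.get)
easiest = max(per_class, key=per_class.get)

print('chance: %.1f %%, mean: %.1f %%' % (chance, mean_accuracy))
print('hardest:', hardest, '| easiest:', easiest)
```

In this run every class is above the chance level, but the cat class is far behind the others, so that is the first place to look when trying to improve the model.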
Complete code
# -*- coding: utf-8 -*-
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms


# GPU
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
print('GPU state:', device)

# Cifar-10 data
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

# Data
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
trainLoader = torch.utils.data.DataLoader(trainset, batch_size=8, shuffle=True, num_workers=2)
testLoader = torch.utils.data.DataLoader(testset, batch_size=8, shuffle=False, num_workers=2)

# Data classes
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')


# Model structure
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16*5*5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net().to(device)
print(net)

# Parameters
criterion = nn.CrossEntropyLoss()
lr = 0.001
epochs = 3
optimizer = optim.SGD(net.parameters(), lr=lr, momentum=0.9)

# Train
for epoch in range(epochs):
    running_loss = 0.0

    for times, data in enumerate(trainLoader, 0):
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)

        # Zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics: average loss over the batches seen so far in this epoch
        running_loss += loss.item()
        if times % 100 == 99 or times+1 == len(trainLoader):
            print('[%d/%d, %d/%d] loss: %.3f' % (epoch+1, epochs, times+1, len(trainLoader), running_loss/(times+1)))

print('Finished Training')

# Test
correct = 0
total = 0

with torch.no_grad():
    for data in testLoader:
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)

        outputs = net(inputs)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test inputs: %d %%' % (100 * correct / total))

class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))

with torch.no_grad():
    for data in testLoader:
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)

        outputs = net(inputs)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels)

        # Use the actual batch length so a smaller final batch does not break the loop
        for i in range(labels.size(0)):
            label = labels[i].item()
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d %%' % (classes[i], 100 * class_correct[i] / class_total[i]))