In this tutorial I am going to explain the following:
- How to load data into a PyTorch tensor for training
- How to define a model
- How to train the model
- How to save and load a model
- How to make predictions using the model
Note: The aim of this tutorial is to show you the whole workflow. Since this is something fundamental, we won't focus on bringing in real-life data or building a complex model architecture; the focus will be solely on the end-to-end workflow.
Let's create a toy dataset
import torch
# Check if CUDA is available and set the device accordingly
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Create a 100x1 tensor with random values and move it to the appropriate device
X = torch.randn(100, 1, device=device)
# Compute y = 3X + 5
y = 3 * X + 5
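A quick sanity check never hurts: both tensors should be 100x1 and live on the same device.
print(X.shape, y.shape)    # torch.Size([100, 1]) torch.Size([100, 1])
print(X.device, y.device)  # both on the selected device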
PyTorch imports
# the torch.nn module contains building blocks (layers, losses) for neural networks
import torch.nn as nn
from torch.optim import SGD
from torch.utils.data import Dataset, DataLoader
import torch
Create a PyTorch Dataset in the format the DataLoader expects
class ToyDataset(Dataset):
    def __init__(self, x, y):
        # x and y are already tensors, so avoid wrapping them in torch.tensor()
        # (that copies the data and raises a warning); just cast and move to the device
        self.x = x.float().to(device)
        self.y = y.float().to(device)
    def __getitem__(self, ix):
        return self.x[ix], self.y[ix]
    def __len__(self):
        return len(self.x)
ds = ToyDataset(X, y)
dl = DataLoader(ds, batch_size=2, shuffle=True)  # batches of 2 samples, reshuffled every epoch
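To verify the DataLoader works as expected, you can pull one batch and inspect its shape; with batch_size=2 each batch should hold two samples:
xb, yb = next(iter(dl))
print(xb.shape, yb.shape)  # torch.Size([2, 1]) torch.Size([2, 1])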
Create model
model = nn.Sequential(
    nn.Linear(1, 6),   # input layer: 1 feature in, 6 hidden units out
    nn.ReLU(),         # non-linearity
    nn.Linear(6, 1)    # output layer: 6 hidden units in, 1 prediction out
).to(device)
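As a quick sanity check, you can count the trainable parameters: the first layer has 1*6 weights + 6 biases = 12, the second has 6*1 weights + 1 bias = 7, so 19 in total.
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(n_params)  # 19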
Define the loss function and train the model
# define the loss function that we need to minimize during training
loss_func = nn.MSELoss()
opt = SGD(model.parameters(), lr=0.001)
loss_history = []
for _ in range(50):  # 50 epochs
    for ix, iy in dl:
        opt.zero_grad()                        # reset gradients from the previous step
        loss_value = loss_func(model(ix), iy)  # forward pass + loss
        loss_value.backward()                  # compute gradients
        opt.step()                             # update the weights
        loss_history.append(loss_value.item())
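To confirm training actually converged, you can plot the recorded per-batch losses (a minimal sketch, assuming matplotlib is available):
import matplotlib.pyplot as plt
plt.plot(loss_history)
plt.xlabel('batch update')
plt.ylabel('MSE loss')
plt.show()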
To get a model summary (torchsummary is a separate package, installed with pip install torchsummary; depending on your version you may need to pass device='cpu' when the model is on CPU)
from torchsummary import summary
summary(model, input_size=(1,))
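If you prefer not to add a dependency, PyTorch's built-in repr also prints the layer structure:
print(model)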
Make a prediction
val = torch.tensor([[1]]).float()
model(val.to(device))
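A single forward pass like the one above still tracks gradients; for inference it is better to switch to eval mode and disable autograd (a minimal sketch, reusing val from above):
model.eval()
with torch.no_grad():
    pred = model(val.to(device))
print(pred.item())  # should be close to 3*1 + 5 = 8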
Save model to disk (weights only)
It is good practice to save the model after moving it to the CPU. Why?
Not everyone has access to a GPU. In fact, the GPU plays a less crucial role during inference, and there are use cases where CPU inference is fast enough even in production settings. So if you save the model after porting it to the CPU, both kinds of users can load it later.
torch.save(model.to('cpu').state_dict(), '/content/drive/MyDrive/mymodel.pth')
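Note that .to('cpu') moves the model in place, so if you want to keep using the GPU afterwards, move it back:
model.to(device)  # move the model back to the original device after saving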
Save model architecture + weights
# Save the entire model
torch.save(model.to('cpu'), '/content/drive/MyDrive/mymodel_full.pth')
Loading the entire model in eval mode for inference
import torch
# Load the entire model (architecture + weights)
model = torch.load('/content/drive/MyDrive/mymodel_full.pth')
model = model.to(device)  # the model was saved from CPU, so move it to the current device
model.eval()  # set the model to evaluation mode for inference
val = torch.tensor([[1]]).float()
model(val.to(device))
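On newer PyTorch versions (2.6+), torch.load defaults to weights_only=True and will refuse to unpickle a full model object. In that case, and only for checkpoints you trust, load it with:
model = torch.load('/content/drive/MyDrive/mymodel_full.pth', weights_only=False)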
Loading model weights only
# this assumes the model architecture is already defined in the model variable in your code
state_dict = torch.load('/content/drive/MyDrive/mymodel.pth')
model.load_state_dict(state_dict)
model.eval()
val = torch.tensor([[1]]).float()
model(val.to(device))
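In a fresh script you must recreate the architecture before loading the weights; a minimal sketch (the layers must match the saved model exactly):
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# redefine the exact same architecture the weights were trained with
model = nn.Sequential(
    nn.Linear(1, 6),
    nn.ReLU(),
    nn.Linear(6, 1)
).to(device)

state_dict = torch.load('/content/drive/MyDrive/mymodel.pth', map_location=device)
model.load_state_dict(state_dict)
model.eval()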
Warning: saving the weights + architecture can cause problems when the PyTorch/Python version changes at load time, because unpickling the full model depends on the exact class definitions being available; saving only the state_dict is the more robust option.